Contexts in source publication

Context 1
... joint sample contains 741 real and 6 fabricated interview clusters, so fakers account for less than 1% of the sample. Figure 4 provides pairwise scatterplots between all features in the training set, where real interviews are marked as small gray dots; for visibility, fakers are marked as larger blue dots. ...
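A scatterplot matrix of this kind can be produced with plain matplotlib. The sketch below is illustrative only: the data are synthetic stand-ins (the actual survey features are not available here), with the same styling as described above, i.e. small gray dots for real interviews and larger blue dots for fakers.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
X_real = rng.normal(0, 1, (200, 3))  # stand-in for real interview clusters
X_fake = rng.normal(2, 1, (4, 3))    # stand-in for the few fabricated clusters
n_feat = X_real.shape[1]

fig, axes = plt.subplots(n_feat, n_feat, figsize=(7, 7))
for i in range(n_feat):
    for j in range(n_feat):
        ax = axes[i, j]
        if i == j:
            # diagonal: marginal distribution of feature i
            ax.hist(X_real[:, i], bins=20, color="gray")
        else:
            # off-diagonal: pairwise scatter, fakers emphasized
            ax.scatter(X_real[:, j], X_real[:, i], s=5, c="gray")
            ax.scatter(X_fake[:, j], X_fake[:, i], s=40, c="tab:blue")
fig.tight_layout()
```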
Context 2
... D shows pairwise scatterplots and kernel density estimates for the synthetic sample. Overall, the tendencies are similar to those illustrated in Figure 4, and it is apparent that the sampling is not random but synthetic, based upon the positions of the original observations in the feature space. After synthetic sampling, the logistic model is fitted on the training set to predict faker probabilities for samples E 1999 and F 2000. ...
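The pipeline described above, synthetic oversampling of the rare faker class followed by a logistic model, can be sketched as follows. This is a minimal SMOTE-style interpolation written by hand (the excerpt does not name the exact oversampling algorithm, so this is an assumption), applied to toy data with the class sizes mentioned in the text (741 real vs. 6 fabricated clusters).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def smote_like(X_min, n_new, k=3):
    """SMOTE-style oversampling: each synthetic point lies on the segment
    between a minority observation and one of its k nearest minority
    neighbours, i.e. it is placed based on the original positions."""
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nn = np.argsort(d)[1:k + 1]          # k nearest neighbours, excluding self
        j = rng.choice(nn)
        lam = rng.random()                   # interpolation weight in [0, 1)
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)

# toy imbalanced data mirroring the text: 741 real vs. 6 fabricated clusters
X_real = rng.normal(0, 1, size=(741, 4))
X_fake = rng.normal(3, 1, size=(6, 4))
X_syn = smote_like(X_fake, n_new=100)

X_train = np.vstack([X_real, X_fake, X_syn])
y_train = np.array([0] * 741 + [1] * (6 + 100))

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# predicted faker probabilities (here evaluated on the original fakers;
# in the study these would be samples E 1999 and F 2000)
proba = clf.predict_proba(X_fake)[:, 1]
```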

Similar publications

Article
Full-text available
Flat lenses enable thinner, lighter, and simpler imaging systems. However, large-area and high-NA flat lenses have been elusive due to computational and fabrication challenges. Here we applied inverse design to create a multi-level diffractive lens (MDL) with thickness < 1.35 µm, diameter of 4.13 mm, and...
Poster
Full-text available
This work details our home-made, low-cost waveguide fabrication setup.

Citations

... Given the importance of good data, effective methods are needed to sanitize survey data before conducting any meaningful empirical research or making decisions. Several methods have been proposed to detect bad answers in surveys, which involve re-interviewing respondents, recording and analyzing their interviews, or analyzing statistical features of the responses [7]. However, applying such verification checks incurs high costs. ...
Chapter
Full-text available
Surveys are one of the most common ways of collecting data on individuals. Such data are of great value for economic and social research. However, the quality of the decisions and research results based on survey data depends on the ability to detect and filter out bad answers. The most common source of bad data is the respondents themselves, who might provide imprecise or fabricated answers for several reasons. In this paper we present a method to sanitize survey data that relies on combining the classification outcomes of three unsupervised machine learning algorithms (DBSCAN, PCA and IForest) aimed at detecting bad answers. Empirical results on real data show that our approach improves the detection of both completely and partially bad answers compared with the results provided by each algorithm independently.
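The abstract names the three detectors but not the combination rule, so the sketch below assumes one plausible choice, a majority vote: an answer is flagged as bad if at least two of DBSCAN (noise label), PCA (large reconstruction error), and Isolation Forest agree. The data are synthetic stand-ins for survey responses.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# toy data: 200 "good" answers plus 10 "bad" (outlying) answers
X = np.vstack([rng.normal(0, 1, (200, 5)),
               rng.normal(6, 1, (10, 5))])
Xs = StandardScaler().fit_transform(X)

# 1) DBSCAN: points labelled -1 are noise, i.e. candidate bad answers
db_flag = DBSCAN(eps=1.0, min_samples=5).fit_predict(Xs) == -1

# 2) PCA: answers with large reconstruction error deviate from the
#    main low-dimensional structure of the responses
pca = PCA(n_components=2).fit(Xs)
err = np.linalg.norm(Xs - pca.inverse_transform(pca.transform(Xs)), axis=1)
pca_flag = err > np.quantile(err, 0.95)

# 3) Isolation Forest: fit_predict returns -1 for outliers
if_flag = IsolationForest(random_state=0).fit_predict(Xs) == -1

# majority vote across the three unsupervised detectors (assumed rule)
bad = (db_flag.astype(int) + pca_flag.astype(int) + if_flag.astype(int)) >= 2
```

Requiring agreement between detectors is what lets the combination outperform each algorithm alone: a single detector's idiosyncratic false positives rarely coincide with those of the other two.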