
Faking and the Validity of Conscientiousness: A Monte Carlo Investigation


Abstract

The article reports the findings from a Monte Carlo investigation examining the impact of faking on the criterion-related validity of Conscientiousness for predicting supervisory ratings of job performance. Based on a review of faking literature, 6 parameters were manipulated in order to model 4,500 distinct faking conditions (5 [magnitude] × 5 [proportion] × 4 [variability] × 3 [faking–Conscientiousness relationship] × 3 [faking–performance relationship] × 5 [selection ratio]). Overall, the results indicated that validity change is significantly affected by all 6 faking parameters, with the relationship between faking and performance, the proportion of fakers in the sample, and the magnitude of faking having the strongest effect on validity change. Additionally, the association between several of the parameters and changes in criterion-related validity was conditional on the faking–performance relationship. The results are discussed in terms of their practical and theoretical implications for using personality testing for employee selection.
Shawn Komar, Douglas J. Brown, and Jennifer A. Komar, University of Waterloo
Chet Robie, Wilfrid Laurier University
Keywords: faking, simulation, validity, personality, selection
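To make the scale of the crossed design in the abstract concrete, the sketch below enumerates a fully crossed 5 × 5 × 4 × 3 × 3 × 5 grid in Python. The level values are placeholders chosen for illustration (the article's actual levels appear in its Method section, which this preview omits); only the grid's structure and its 4,500-cell size come from the abstract.

```python
# Minimal sketch of the fully crossed simulation design described in the
# abstract. The level values below are illustrative placeholders, NOT the
# article's actual levels.
from itertools import product

design = {
    "magnitude": [0.0, 0.5, 1.0, 1.5, 2.0],           # SD units of score inflation (assumed)
    "proportion": [0.0, 0.25, 0.50, 0.75, 1.0],       # share of applicants who fake (assumed)
    "variability": [0.0, 0.25, 0.5, 1.0],             # SD of faking amounts across fakers (assumed)
    "faking_consc_rel": [-0.25, 0.0, 0.25],           # faking-Conscientiousness relationship (assumed)
    "faking_perf_rel": [-0.25, 0.0, 0.25],            # faking-performance relationship (assumed)
    "selection_ratio": [0.05, 0.10, 0.25, 0.50, 1.0], # top-down selection ratio (assumed)
}

conditions = list(product(*design.values()))
print(len(conditions))  # 5 * 5 * 4 * 3 * 3 * 5 = 4500 distinct faking conditions
```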
Research into the use of personality measures for employee selection has expanded since the early 1990s, when meta-analytic reviews demonstrated their utility for predicting work behaviors (Barrick & Mount, 1991; Hurtz & Donovan, 2000; Tett, Jackson, & Rothstein, 1991). Recent work has further bolstered interest, demonstrating that some personality dimensions incrementally predict job performance beyond cognitive ability (Schmidt & Hunter, 1998, 2004) and that, unlike cognitive ability, personality tests do not evidence meaningful subgroup (e.g., sex and ethnicity) differences (Hough, 1998). With expanding usage has come growing controversy regarding the limitations of personality testing. Because it is impossible to verify the accuracy of applicants' responses to personality items, and because these items are typically transparent, some authors have noted that respondent faking may be commonplace (Levin & Zickar, 2002; Rosse, Stecher, Miller, & Levin, 1998). From a selection perspective, faking may be quite problematic, as it can alter the true rank ordering of individuals, resulting in the hiring of less-qualified people and weakening predictive validity (Rosse et al., 1998). Because Conscientiousness may be the single best personality predictor of work performance (Schmidt & Hunter, 1998) but is among the personality dimensions most susceptible to faking (McFarland & Ryan, 2000), and because the legal defensibility of selection tests often depends on criterion-related validity (Guion, 1998), we investigated how faking influences the criterion-related validity of Conscientiousness.
Although field and experimental studies have advanced our understanding of faking, these methodologies have limitations when investigating faking and criterion-related validity. Traditional methodologies do not permit researchers to systematically manipulate participants' response distortion; they make it nearly impossible to separate response distortion from the true level of a trait; and they allow only a restricted number of parameters to be assessed at one time. When conventional techniques have been of limited utility, personnel psychologists have turned to Monte Carlo simulations (e.g., Murphy & Shiarella, 1997; Roth, Bobko, Switzer, & Dean, 2001), a procedure that allows experimenters to generate data to explore hypothesized relationships (Robie & Komar, 2007). Unlike previous faking simulations that have examined faking corrections (Schmitt & Oswald, 2006), we utilized simulation techniques to investigate how six faking parameters impact the criterion-related validity of Conscientiousness. Although some simulation work has investigated faking and personality test validity (e.g., Zickar, Rosse, & Levin, 1996), these efforts have been criticized for ignoring key parameters (McFarland & Ryan, 2000; Smith & Robie, 2004). We extend prior work by modeling a fuller range of parameters to better understand which factors matter most when assessing the impact of faking on criterion-related validity, as well as documenting when faking might exert the most influence. As a framework for under- ...
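The simulation approach described above can be sketched compactly: draw true Conscientiousness and job performance scores with a known criterion-related validity, add a faking component to a proportion of applicants, select top-down on the distorted scores, and compare validity before and after distortion. The sketch below (Python with NumPy) is a simplified reconstruction under assumed parameter values, not the authors' actual code; in this minimal version faking is independent of both true Conscientiousness and performance, whereas the article also crosses negative and positive faking–Conscientiousness and faking–performance relationships.

```python
# Simplified sketch of one cell of a faking Monte Carlo: how much does the
# criterion-related validity of Conscientiousness change when a proportion of
# applicants inflate their scores? All parameter values are assumptions for
# illustration, not those used by Komar et al. (2008).
import numpy as np

rng = np.random.default_rng(seed=1)

def simulate_cell(n=10_000, true_validity=0.22, proportion=0.50,
                  magnitude=1.0, variability=0.5, selection_ratio=0.25):
    # True Conscientiousness and job performance with a known correlation.
    cov = [[1.0, true_validity], [true_validity, 1.0]]
    consc, perf = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

    # A random subset of applicants fakes; their observed scores are inflated
    # by `magnitude` SD units on average, varying across fakers. Here faking
    # is uncorrelated with true Conscientiousness and with performance.
    fakers = rng.random(n) < proportion
    inflation = np.where(fakers, rng.normal(magnitude, variability, n), 0.0)
    observed = consc + inflation

    # Validity in the full applicant pool, before and after distortion.
    r_pool_true = np.corrcoef(consc, perf)[0, 1]
    r_pool_faked = np.corrcoef(observed, perf)[0, 1]

    # Validity among those hired top-down on the distorted scores
    # (range restriction shrinks this further).
    cutoff = np.quantile(observed, 1.0 - selection_ratio)
    hired = observed >= cutoff
    r_hired = np.corrcoef(observed[hired], perf[hired])[0, 1]
    return r_pool_true, r_pool_faked, r_hired

print(simulate_cell())
```

Running simulate_cell() with these defaults typically shows the pooled validity dropping below its true value and dropping further among those hired, which mirrors the kind of validity-change outcome the study tabulates across its 4,500 conditions.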
Shawn Komar, Douglas J. Brown, and Jennifer A. Komar, Department of Psychology, University of Waterloo; Chet Robie, School of Business & Economics, Wilfrid Laurier University.

This research is based on Shawn Komar's master's thesis, conducted under the supervision of Douglas J. Brown. This research was supported by a grant from the Social Sciences and Humanities Research Council of Canada to Chet Robie and Douglas J. Brown, and by the facilities of the Shared Hierarchical Academic Research Computing Network (SHARCNET; www.sharcnet.ca). A draft of this article was presented at the 22nd Annual Conference of the Society for Industrial and Organizational Psychology, April 2007. We thank Winfred Arthur, Jr., Ann Marie Ryan, Jill Ellingson, Daniel Heller, John Michela, Jonathan Oakman, and Michael Biggs for their assistance.

Correspondence concerning this article should be addressed to Shawn Komar, Department of Psychology, University of Waterloo, 200 University Avenue West, Waterloo, Ontario, Canada N2L 3G1. E-mail: sgkomar@uwaterloo.ca
Journal of Applied Psychology, 2008, Vol. 93, No. 1, 140–154. Copyright 2008 by the American Psychological Association. DOI: 10.1037/0021-9010.93.1.140
... Personality traits assessed on these inventories explain important differences in work behavior and are useful predictors of job-related outcomes (e.g., Christiansen & Tett, 2013; Gatewood et al., 2016; Judge et al., 2013; Shaffer & Postlethwaite, 2012; Tett & Christiansen, 2007; Zimmerman, 2008). However, there have long been concerns over the use of personality tests in high-stakes settings because such assessments are easily faked (e.g., Griffith & Robie, 2013), resulting in score distortion and decrements in validity (Hendy et al., 2021; Holden, 2008; Jeong et al., 2017; Komar et al., 2008; Peterson et al., 2011; Schmit & Ryan, 1993). Concerns over faking have led to exploration of measurement methods that are more resistant to the effects of faking. ...
... Given their reduced susceptibility to the effects of faking, FC scores are likely better suited for use in high-stakes settings than SS measures. Faking is harmful to test validity and score interpretation (Hendy et al., 2021; Holden, 2008; Jeong et al., 2017; Komar et al., 2008; Peterson et al., 2011; Schmit & Ryan, 1993), and thus, using a format with greater resistance to faking is desirable. ...
Article
Full-text available
Forced-choice (FC) personality assessments have shown potential in mitigating the effects of faking. Yet despite increased attention and usage, there exist gaps in understanding the psychometric properties of FC assessments, and particularly when compared to traditional single-stimulus (SS) measures. The present study conducted a series of meta-analyses comparing the psychometric properties of FC and SS assessments after placing them on an equal playing field—by restricting to only studies that examined matched assessments of each format, and thus, avoiding the extraneous confound of using comparisons from different contexts (Sackett, 2021). Matched FC and SS assessments were compared in terms of criterion-related validity and susceptibility to faking in terms of mean shifts and validity attenuation. Additionally, the correlation between FC and SS scores was examined to help establish construct validity evidence. Results showed that matched FC and SS scores exhibit strong correlations with one another (ρ = .69), though correlations weakened when the FC measure was faked (ρ = .59) versus when both measures were taken honestly (ρ = .73). Average scores increased from honest to faked samples for both FC (d = .41) and SS scores (d = .75), though the effect was more pronounced for SS measures and with larger effects for context-desirable traits (FC d = .61, SS d = .99). Criterion-related validity was similar between matched FC and SS measures overall. However, when considering validity in faking contexts, FC scores exhibited greater validity than SS measures. Thus, although FC measures are not completely immune to faking, they exhibit meaningful benefits over SS measures in contexts of faking.
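The two faking effects this meta-analysis summarizes, standardized mean shifts (d) and validity attenuation, can be illustrated at the primary-study level. The snippet below computes both for simulated honest and faked administrations; the data and resulting effect sizes are made up for illustration and are not the study's.

```python
# Illustration of the two faking effects the meta-analysis compares:
# (1) the standardized mean shift d between honest and faked scores, and
# (2) attenuation of the score-criterion correlation under faking.
# Simulated data; numbers are not from the study.
import numpy as np

rng = np.random.default_rng(seed=2)
n = 5_000

trait = rng.normal(0, 1, n)                    # true trait standing
criterion = 0.3 * trait + rng.normal(0, 1, n)  # outcome correlated with the trait

honest = trait + rng.normal(0, 0.5, n)         # honest report: random error only
faked = honest + rng.gamma(2.0, 0.4, n)        # faked report: positively skewed inflation

pooled_sd = np.sqrt((honest.var(ddof=1) + faked.var(ddof=1)) / 2)
d = (faked.mean() - honest.mean()) / pooled_sd
print(f"mean shift d    = {d:.2f}")
print(f"validity honest = {np.corrcoef(honest, criterion)[0, 1]:.2f}")
print(f"validity faked  = {np.corrcoef(faked, criterion)[0, 1]:.2f}")
```

Because the inflation adds criterion-irrelevant variance, the faked correlation comes out smaller than the honest one, which is the attenuation pattern the meta-analysis quantifies.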
... Additionally, biodata scores have been shown to be susceptible to impression management, such that respondents can fake them to appear more desirable when motivated to do so (e.g., Becker & Colquitt, 1992). This is concerning, given that biodata validity is lower in contexts where respondents are motivated to fake, such as in applicant contexts, and similar self-report methods such as personality assessments display diminished evidence of validity when faked (Christiansen et al., 2021; Hendy et al., 2021; Holden, 2008; Jeong et al., 2017; Komar et al., 2008; Peterson et al., 2011; Schmit & Ryan, 1993; Speer et al., in press). Zhang and Kuncel (2020) did not investigate differences in biodata validity across incumbent and applicant contexts. ...
... With self-report personality assessments, for example, job applicants score higher than job incumbents (Birkeland et al., 2006). Perhaps more concerningly, there are decrements in personality validity when faking occurs (Christiansen et al., 2021; Hendy et al., 2021; Holden, 2008; Jeong et al., 2017; Komar et al., 2008; Peterson et al., 2011; Schmit & Ryan, 1993). ...
... This is important in the study of applicant distortion of personality test scores because there is no consensus as to which applicants fake and by how much. Simulations have generally found that faking will degrade linear construct relationships (e.g., Komar et al., 2008; Marcus, 2006; Schmitt & Oswald, 2006; Zickar et al., 1996). ...
Article
Full-text available
Two field studies were conducted to examine how applicant faking impacts the normally linear construct relationships of personality tests, using segmented regression and by partitioning samples to evaluate effects on validity across different ranges of test scores. Study 1 investigated validity decay across score ranges of applicants to a state police academy (N = 442). Personality test scores had nonlinear construct relations in the applicant sample, with scores from the top of the distribution being worse predictors of subsequent performance but more strongly related to social desirability scores; this pattern was not found for the partitioned scores of a cognitive test. Study 2 compared the relationship between personality test scores and job performance ratings of applicants (n = 97) to those of incumbents (n = 318) in a customer service job. Departures from linearity were observed in the applicant but not in the incumbent sample. Effects of applicant distortion on the validity of personality tests are especially concerning when validity decay increases toward the top of the distribution of test scores. Observing slope differences across ranges of applicant personality test scores can be an important tool in selection.
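The partitioning logic of these studies, estimating validity separately within ranges of test scores, can be illustrated with simulated data. In the hypothetical setup below, faking pushes some low-trait respondents into the top of the observed score distribution, so the score-criterion correlation typically decays in the upper segment relative to the lower one; the parameters are assumptions, not values from either study.

```python
# Sketch of validity partitioned by score range: when distortion concentrates
# fakers at the top of the observed distribution, the score-criterion
# correlation tends to be weaker in the upper segment. Hypothetical data.
import numpy as np

rng = np.random.default_rng(seed=3)
n = 4_000

trait = rng.normal(0, 1, n)
perf = 0.3 * trait + rng.normal(0, 1, n)

# 30% of respondents inflate their scores by ~1.5 SD, pushing some
# low-trait people into the top score range.
fakers = rng.random(n) < 0.3
score = trait + np.where(fakers, rng.normal(1.5, 0.5, n), 0.0)

top = score >= np.quantile(score, 2 / 3)     # upper third of observed scores
bottom = score <= np.quantile(score, 1 / 3)  # lower third of observed scores

print(f"r (upper third) = {np.corrcoef(score[top], perf[top])[0, 1]:.2f}")
print(f"r (lower third) = {np.corrcoef(score[bottom], perf[bottom])[0, 1]:.2f}")
```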
... This is particularly concerning in selection settings, where applicants are motivated to fake their responses to be hired. The resulting responses could alter the rank order of job applicants and distort the factor structure, reliability, and validity evidence of personality tests, consequently harming the utility of selection systems (e.g., Birkeland et al., 2006; Komar et al., 2008; Zickar et al., 2004). ...
Article
Full-text available
Human resource (HR) practices have been focused on using assessments that are robust to faking and response biases associated with Likert‐type scales. As an alternative, multidimensional forced‐choice (MFC) measures have recently shown advances in reducing faking and response biases while retaining similar levels of validity to Likert‐type measures. Although research evidence supports the effectiveness of MFC measures, fairness issues resulting from gender biases in the use of MFC measures have not yet been investigated in the literature. Given the importance of gender equity in HR development, it is vital that new assessments improve upon known gender biases in the historical use of Likert‐type measures and do not lead to gender discrimination in HR practices. In this vein, our investigation focuses specifically on potential gender biases in the use of MFC measures for HR development. Specifically, our study examines differential test‐taker reactions and differential prediction of self‐assessed leadership ability between genders when using the MFC personality measure. In an experimental study with college students, we found no evidence of gender differences in test‐taker reactions to MFC measures. In a second cross‐sectional study with full‐time employees, we found evidence of intercept differences, such that females were frequently underpredicted when using MFC personality measures to predict self‐assessed leadership ability. Moreover, the pattern of differential prediction using MFC measures was similar to that of Likert‐type measures. Implications for MFC personality measures in applied practice are discussed.
... Faking refers to "the tendency to deliberately present oneself in a more positive manner than is accurate in order to meet the perceived demands of the testing situations" (Fan et al., 2012, p. 867). It has been suggested that faking, as situationally induced response distortion, may introduce construct-irrelevant variance to personality scores and hence may attenuate the criterion-related validity of personality scores (Hough et al., 1990; Komar et al., 2008; Mueller-Hanson et al., 2003). ...
Article
Although research has consistently shown that warnings against faking are effective in reducing faking and improving individual hiring decisions, whether warnings may thus boost the criterion-related validity of personality scores has remained unclear. The present study investigates this important issue through a field experiment. Participants were 188 applicants for the MBA program at a Chinese university, who completed a Big Five personality inventory during the campus interview. A warning message and a control message were randomly assigned at the beginning of the personality test. Participants' first-semester academic grades were used as the criterion. Results indicated that (a) applicant faking (measured via an impression management scale) negatively impacted the criterion-related validity of personality scores; (b) warnings lowered applicant faking; and (c) applicant faking mediated the effect of warnings on the criterion-related validity. However, the overall effect of warnings on criterion-related validity was not significant, suggesting that warnings might have yielded some unintended effects that harmed criterion-related validity.
... On the one hand, some authors argued that faking does not affect the validity of personality measures (e.g., Hough, 1998b; Komar et al., 2008; Tett & Simonet, 2021; Weekley et al., 2004). On the other hand, extensive empirical evidence suggests that faking affects the mean structure, the covariance structure, and criterion validity of self-report measures of personality (e.g., Christiansen et al., 2021; Donovan et al., 2014; Geiger et al., 2018; Krammer et al., 2017; MacCann et al., 2017; Pauls & Crost, 2005; Schmit & Ryan, 1993). ...
Article
Full-text available
A key finding in personnel selection is the positive correlation between conscientiousness and job performance. Evidence predominantly stems from concurrent validation studies with incumbent samples but is readily generalized to predictive settings with job applicants. This is problematic because the extent to which faking and changes in personality affect the measurement likely vary across samples and study designs. Therefore, we meta-analytically investigated the relation between conscientiousness and job performance, examining the moderating effects of sample type (incumbent vs. applicant) and validation design (concurrent vs. predictive). The overall correlation of conscientiousness and job performance was in line with previous meta-analyses (r̄ = .17, k = 102, n = 23,305). In our analyses, the correlation did not differ across validation designs (concurrent: r̄ = .18; predictive: r̄ = .15), sample types (incumbents: r̄ = .18; applicants: r̄ = .14), or their interaction. Critically, however, our review revealed that only a small minority of studies (~12%) were conducted with real applicants in predictive designs. Thus, barely a fraction of research is conducted under realistic conditions. Therefore, it remains an open question if self-report measures of conscientiousness retain their predictive validity in applied settings that entail faked responses. We conclude with a call for more multivariate research on the validity of selection procedures in predictive settings with actual applicants. © 2022 The Authors. International Journal of Selection and Assessment published by John Wiley & Sons Ltd.
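For readers unfamiliar with the estimator behind figures such as r̄ = .17, a bare-bones version is a sample-size-weighted mean of study correlations, computed overall and within moderator subgroups. The sketch below uses made-up study rows, not the meta-analysis's database, and omits the corrections (e.g., for measurement unreliability) that full psychometric meta-analysis applies.

```python
# Bare-bones moderator analysis: sample-size-weighted mean correlations,
# overall and by validation design. Study rows are invented for illustration.
studies = [
    # (r, n, design)
    (0.20, 310, "concurrent"),
    (0.15, 120, "concurrent"),
    (0.22, 540, "concurrent"),
    (0.12, 260, "predictive"),
    (0.18, 410, "predictive"),
]

def weighted_mean_r(rows):
    # Weight each study's correlation by its sample size.
    total_n = sum(n for _, n, _ in rows)
    return sum(r * n for r, n, _ in rows) / total_n

print(f"overall    r-bar = {weighted_mean_r(studies):.3f}")
for design in ("concurrent", "predictive"):
    subset = [s for s in studies if s[2] == design]
    print(f"{design:10s} r-bar = {weighted_mean_r(subset):.3f}")
```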
Article
This research proposes a faking‐mitigation strategy for situational judgment tests (SJTs), referred to as the constructed response retest (CR‐retest). The CR‐retest strategy involves presenting SJT items in a constructed response format first, followed by equivalent closed‐ended items with the same situation description. Two field experiments (N1 = 733, N2 = 273) were conducted to investigate the effects of this strategy and contrast it with a commonly used pretest warning message. Study 1 revealed that the CR‐retest strategy was more effective than the warning message in reducing score inflation and improving criterion‐related validity. Study 2 delved deeper by investigating the effects of the CR‐retest strategy on applicant reactions in a 2 (with or without CR‐retest strategy) × 2 (warning or control message) between‐subjects design. The results showed that applicants reported positive fairness perceptions on SJT items with the CR‐retest strategy. The CR‐retest strategy was effective in reducing faking by evoking threat perceptions, whereas the warning message heightened threat and fear. Combining the two strategies further decreased faking without undermining fairness perceptions. Overall, our results indicate that the CR‐retest strategy could be a valuable method to mitigate faking in real‐life selection settings.
Article
Full-text available
This study provides a comprehensive investigation into whether social desirability alters the factor structure of personality measures. The study brought together 4 large data sets wherein different organizational samples responded to different personality measures. This facilitated conducting 4 separate yet parallel investigations. Within each data set, individuals identified through a social desirability scale as responding in an honest manner were grouped together, and individuals identified as responding in a highly socially desirable manner were grouped together. Using various analyses, the fit of higher order factor structure models was compared across the 2 groups. Results were the same for each data set. Social desirability had little influence on the higher order factor structures that characterized the relationships among the scales of the personality measures.
Chapter
Full-text available
In psychological assessment, we aim for the most accurate description of some cognitive or behavioral attribute. In assessment involving self-reports, this objective is invariably haunted by the possibility of misrepresentation. Certainly we would be sceptical of self-reports of intelligence, perhaps because of its universal desirability. Among the few qualities typically rated as even more desirable than intelligence is having a good personality. Thus it seems dangerous to ignore the possibility that at least some respondents systematically misrepresent their own personality.