Comparative Fit Indexes in Structural Models


Abstract

Normed and nonnormed fit indexes are frequently used as adjuncts to chi-square statistics for evaluating the fit of a structural model. A drawback of existing indexes is that they estimate no known population parameters. A new coefficient is proposed to summarize the relative reduction in the noncentrality parameters of two nested models. Two estimators of the coefficient yield new normed (CFI) and nonnormed (FI) fit indexes. CFI avoids the underestimation of fit often noted in small samples for Bentler and Bonett's (1980) normed fit index (NFI). FI is a linear function of Bentler and Bonett's non-normed fit index (NNFI) that avoids the extreme underestimation and overestimation often found in NNFI. Asymptotically, CFI, FI, NFI, and a new index developed by Bollen are equivalent measures of comparative fit, whereas NNFI measures relative fit by comparing noncentrality per degree of freedom. All of the indexes are generalized to permit use of Wald and Lagrange multiplier statistics. An example illustrates the behavior of these indexes under conditions of correct specification and misspecification. The new fit indexes perform very well at all sample sizes.
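The abstract names four indexes, but this excerpt does not give their formulas. As a minimal sketch, assuming the commonly cited definitions (noncentrality estimated by the chi-square statistic minus its degrees of freedom, with the baseline being the more restrictive nested model), they could be computed as follows. The function name `fit_indexes`, its argument names, and the numbers in the usage lines are illustrative assumptions, not taken from the article.

```python
def fit_indexes(t_model, df_model, t_base, df_base):
    """Comparative fit indexes from two nested chi-square statistics.

    t_model, df_model: statistic and degrees of freedom of the model of interest.
    t_base, df_base:   the same for the more restrictive baseline model.
    The formulas are the commonly cited forms, assumed here rather than
    quoted from the excerpt. Assumes t_base > df_base > 0 and df_model > 0.
    """
    # Estimated noncentrality: chi-square statistic minus degrees of freedom.
    d_model = t_model - df_model
    d_base = t_base - df_base

    # NFI: proportional reduction in the statistic itself.
    nfi = (t_base - t_model) / t_base
    # NNFI: compares the statistic per degree of freedom.
    nnfi = (t_base / df_base - t_model / df_model) / (t_base / df_base - 1.0)
    # FI: relative reduction in estimated noncentrality (may fall outside 0-1).
    fi = 1.0 - d_model / d_base
    # CFI: same idea, with the estimates truncated so that 0 <= CFI <= 1.
    denom = max(d_base, d_model, 0.0)
    cfi = 1.0 if denom == 0.0 else 1.0 - max(d_model, 0.0) / denom
    return {"NFI": nfi, "NNFI": nnfi, "FI": fi, "CFI": cfi}


# Hypothetical statistics, chosen only for illustration.
close_fit = fit_indexes(t_model=30.0, df_model=20, t_base=500.0, df_base=28)
over_fit = fit_indexes(t_model=15.0, df_model=20, t_base=500.0, df_base=28)
print(close_fit)
print(over_fit)
```

In the second call the model statistic falls below its degrees of freedom, so FI exceeds 1 while CFI is held at 1. This is the normed versus nonnormed distinction drawn in the abstract.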
QUANTITATIVE METHODS IN PSYCHOLOGY
P. M. Bentler
University of California, Los Angeles

This research was supported in part by United States Public Health Service Grants DA01070 and DA00017 and is based on a February 1988 technical report and a paper presented at the Psychometric Society meetings, June 1988, Los Angeles. Helpful discussions with J. de Leeuw, R. I. Jennrich, T. A. B. Snijders, and J. A. Woodward; the computer assistance of Shinn-Tzong Wu; and the production assistance of Julie Speckart are gratefully acknowledged.
Correspondence concerning this article should be addressed to P. M. Bentler, Department of Psychology, University of California, Los Angeles, California 90024-1563.

As is well known, the goodness-of-fit test statistic $T$ used in evaluating the adequacy of a structural model is typically referred to the chi-square distribution to determine acceptance or rejection of a specific null hypothesis, $\Sigma = \Sigma(\theta)$. In the context of covariance structure analysis, $\Sigma$ is the population covariance matrix and $\theta$ is a vector of more basic parameters, for example, the factor loadings and intercorrelations and unique variances in a confirmatory factor analysis. The statistic $T$ reflects the closeness of $\hat{\Sigma} = \Sigma(\hat{\theta})$, based on the estimator $\hat{\theta}$, to the sample matrix $S$, the sample covariance matrix in covariance structure analysis, in the chi-square metric. Acceptance or rejection of the null hypothesis via a test based on $T$ may be inappropriate or incomplete in model evaluation for several reasons:

1. Some basic assumptions underlying $T$ may be false and the distribution of the statistic may not be robust to violation of these assumptions.
2. No specific model $\Sigma(\theta)$ may be assumed to exist in the population, and $T$ is intended to provide a summary regarding closeness of $\hat{\Sigma}$ to $S$, but not necessarily a test of $\Sigma = \Sigma(\theta)$.
3. In small samples, $T$ may not be chi-square distributed; hence, the probability values used to evaluate the null hypothesis may not be correct.
4. In large samples, any a priori hypothesis $\Sigma = \Sigma(\theta)$, although only trivially false, may be rejected.

As a consequence, the statistic $T$ may not be clearly interpretable, and transformations of $T$ designed to map it into a more interpretable 0-1, or approximate 0-1, range have been developed. Those indexes are usually called goodness-of-fit indexes (e.g., Bentler, 1983, p. 507; Jöreskog & Sörbom, 1984, p. 1.40). A related class of indexes, here called comparative goodness-of-fit indexes, assess $T$ in relation to the fit of a more restrictive model. These comparative fit indexes, formalized by Bentler and Bonett (1980), are very widely used (Bentler & Bonett, 1987) and are the sole object of this article. Alternative approaches to evaluating model adequacy are reviewed elsewhere (e.g., Bollen & Liang, 1988; Bozdogan, 1987; LaDu & Tanaka, in press; Wheaton, 1987). Although covariance structure analysis is emphasized, the methods developed here hold for any type of structural model, including, for example, mean-covariance structures and log-linear models.
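
The sample-size problem listed as reason 4 above, which motivates these indexes, can be sketched numerically. The sketch assumes the standard large-sample approximation that, under a fixed population discrepancy F0, the mean of T is roughly df + (N - 1) F0. The values of `DF`, `F0`, and the sample sizes are hypothetical, and comparing the mean of T with a critical value is only a rough indication of rejection, not a power calculation.

```python
# Hypothetical setting, chosen only for illustration.
DF = 10                # degrees of freedom of the model test
CRITICAL_05 = 18.307   # upper 5% point of the chi-square distribution with 10 df
F0 = 0.02              # assumed small population discrepancy ("trivially false" model)


def expected_t(n, f0=F0, df=DF):
    """Approximate mean of T under misfit: df plus noncentrality (n - 1) * f0."""
    return df + (n - 1) * f0


for n in (50, 200, 1000, 5000):
    t = expected_t(n)
    verdict = "reject" if t > CRITICAL_05 else "retain"
    print(f"N = {n:5d}  expected T = {t:7.2f}  -> {verdict}")
```

With the same small misfit, the expected statistic stays below the critical value at the smaller sample sizes and exceeds it at the larger ones, which is why a test based on T alone can be hard to interpret.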

Although more than 30 fit indexes have been reported and their empirical behavior studied (Marsh, Balla, & McDonald, 1988), and although new ones continue to be developed (Bollen, 1989), it is surprising to note that they have been developed as purely descriptive statistics. Apparently, no population parameter has been defined that is being estimated by any of the existing indexes. In this article, I define an explicit population comparative fit coefficient, provide two alternative estimators of the coefficient, and investigate the asymptotic relations between the new and previously defined comparative fit indexes. Furthermore, new indexes based on Wald and Lagrange multiplier statistics are developed.

Nested Models and Comparative Fit

In evaluating comparative model fit, it is helpful to focus on more than one pair of models. Consider a series of nested models,

Psychological Bulletin, 1990, Vol. 107, No. 2, 238-246
Copyright 1990 by the American Psychological Association, Inc. 0033-2909/90/$00.75
Chapter
Asymptotically distribution-free efficient estimates are obtained for a large class of models and estimators, all based on a postulate of the form: $\sqrt{T}\,(s - \sigma)$ converges in law to a multivariate normal distribution, with $\sigma = \sigma(\theta)$ being a function of a set of structural parameters under the null hypothesis. First, we deal with minimum $\chi^2$ or nonlinear generalized least squares estimation under nonlinear constraints and consider the problems of consistency, asymptotic normality and efficiency, bias, and tests of fit and restrictions. Thereafter, we develop the parallel theory for an estimator obtained by linearization of the structural model as well as constraint functions on the parameters. Linearized estimators and tests based on a one-step improvement from an initial consistent estimator are shown to have the same optimal statistical properties as their fully iterated counterparts. The classical psychometric factor analytic model, the econometric simultaneous equation system, and related models provide illustrations of the theory. A number of new estimators and their asymptotic distributions are described. New perspectives on old estimators are also offered.
Article
A simulation study of the effects of sample size on the overall fit statistic provided by the LISREL program indicates the statistic is well behaved over a wide range of sample sizes for simple models. However, this statistic is apparently not chi square distributed for more complex models when samples are relatively small, and will reject the hypothesized model too often. A set of additional measures suggested by various researchers for evaluating causal models also is examined. These statistics are well behaved for both models tested as they converge to the true value and their variance approaches zero as sample size increases.
Article
Assessing overall model fit is an important problem in general structural equation models. One of the most widely used fit measures is Bentler and Bonett's (1980) normed index. This article has three purposes: (1) to propose a new incremental fit measure that provides an adjustment to the normed index for sample size and degrees of freedom, (2) to explain the relation between this new fit measure and the other ones, and (3) to illustrate its properties with an empirical example and a Monte Carlo simulation. The simulation suggests that the mean of the sampling distribution of the new fit measure stays at about one for different sample sizes whereas that for the normed fit index increases with N. In addition, the standard deviation of the new measure is relatively low compared to some other measures (e.g., Tucker and Lewis's (1973) and Bentler and Bonett's (1980) nonnormed index). The empirical example suggests that the new fit measure is relatively stable for the same model in different samples. In sum, it appears that the new incremental measure is a useful complement to the existing fit measures.
Book
2nd ed. of 1981 PhD thesis. Topics: stochastic and numerical convergence properties of Partial Least Squares (PLS), the comparison of LISREL with PLS, and an analysis of GLS for covariance structures under misspecification
Article
In recent years a number of measures have been suggested for the assessment of fit of overidentified models with latent variables (i.e., covariance structure models). This article discusses the logic of the fit problem, reviews the analytical intentions of six of these measures, with emphasis on their dependence on sample size, and compares the operational behavior of these measures in three-model situations: in a confirmatory factor model based on small N, and in two covariance structure models, one based on a slightly larger N and the other based on a large N. Given that these models and data are “typical,” results suggest that certain measures are both more stable across sample sizes and more sensitive to important variation in fit across substantively plausible models. The article concludes by suggesting a three-component approach to fitting: use of multiple measures, strategical overfitting, and comparison of parameter estimates in borderline versus more clearly sufficient models in terms of fit.
Article
In this paper we compare alternative asymptotic approximations to the power of the likelihood ratio test used in covariance structure analysis for testing the fit of a model. Alternative expressions for the noncentrality parameter (ncp) lead to different approximations to the power function. It appears that for alternative covariance matrices close to the null hypothesis, the alternative ncp's lead to similar values, while for alternative covariance matrices far from $H_0$ the different expressions for the ncp can conflict substantively. Monte Carlo evidence shows that the ncp proposed in Satorra and Saris (1985) gives the most accurate power approximations.
Article
In the context of a robust generalized least squares approach, a new statistic for testing the validity of constraints in structural equation models is proposed. It is shown that the new test requires significantly less computational effort than the traditional one. Extension to models in several populations is also considered. Illustrative applications based on real data are given.