Violation of the assumption of local independence in the Rasch model with covariates

Daniela Fruttini
Department of Economics, Finance and Statistics
daniela.fruttini@stat.unipg.it
Keywords: Binary variables; EM algorithm; log-linear models; marginal models.
1. Introduction
In many contexts, an individual characteristic which is not directly observable is
measured by the responses to a set of items. For example, in geriatrics, the quality of
life of elderly people is measured by responses to items which concern many aspects of
daily living (WHO, 1980), such as "Without aid, have you any difficulty in
washing? or eating?". Responses to these items should discriminate certain categories of
individuals corresponding to different levels of disability. Similar approaches are
adopted in psychology, assessment of learning and in the social sciences.
Several methods are available for the analysis of data deriving from the administration
of a questionnaire made up of items of the type above. Lord was the first to propose a
model tailored to the analysis of such data, see Lord (1952), and he is therefore
considered the founder of Item Response Theory (IRT). Many other models have
been proposed starting from the basic approach of Lord; for a review see, among others,
Hambleton and Swaminathan (1991). Among these models, the one developed by Rasch
(1960) is still the most popular. One of the basic assumptions of this and many other
IRT models is that of local independence (LI): the responses provided by the same
subject to different items are conditionally independent given the latent characteristic.
In many cases, the LI assumption may be too restrictive, since the response variables are
correlated even conditionally on the latent characteristic they measure. This typically
happens when individuals respond to the same items on different occasions or when there
is a learning-through-training effect of one item on the other items. In these situations,
the LI assumption must be relaxed in a suitable way in order to avoid distortion in
estimating the parameters of interest.
The present work is a summary of the thesis developed within the Doctorate program in
“Mathematical and Statistical Methods for the Economic and Social Sciences” at the
University of Perugia; the aim of the thesis is to show how the LI assumption may be
relaxed in the Rasch model with covariates. Two main approaches, based on log-linear
and marginal models, are proposed. The first has the advantage of being easier to
implement, but it gives rise to an IRT model whose parameters are more difficult to
interpret. The marginal approach is more complicated to implement, but it gives rise to a
more interpretable model. The work deals especially with the efficient implementation
of the estimation algorithms for the resulting models. These algorithms are used for the
analysis of real data. In particular, the approaches are illustrated by the analysis of data
concerning tests for the evaluation of knowledge of the Italian language (CELI) made
available by the University for Foreigners in Perugia. In a second application, data
from a longitudinal investigation (InChianti) of a population of elderly subjects living
in the Chianti area of Tuscany are analyzed. This survey aims to assess the
possible causes of psycho-physical deterioration of individuals due to ageing.
2. Extensions of the Rasch model with covariates
Suppose that a questionnaire of J items is administered to a group of n subjects. The
binary response variable for the answer of individual i to item j is denoted by $y_{ij}$, and
$\mathbf{y}_i = (y_{i1}, \ldots, y_{iJ})'$ is the response vector of the same subject. One of the basic
assumptions of the Rasch model with covariates is that the responses to the items
depend on a single latent trait, which is typically interpreted as ability, in the following
way:
$$p_{ij} = p(y_{ij} = 1 \mid \theta_i, \mathbf{x}_{ij}) = \frac{\exp(\theta_i + \mathbf{x}_{ij}'\boldsymbol{\beta}_j)}{1 + \exp(\theta_i + \mathbf{x}_{ij}'\boldsymbol{\beta}_j)},$$
where $\theta_i$ is the ability of subject i, $\boldsymbol{\beta}_j$ is a vector of parameters for item j (which
includes the difficulty level) and $\mathbf{x}_{ij}$ is the corresponding vector of covariates having a
suitable structure. As already said, the Rasch model also assumes LI, i.e. the response
variables in $\mathbf{y}_i$ are conditionally independent given $\theta_i$.
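As a minimal sketch of the model just described, the following Python code evaluates the response probability as a logistic function of the ability plus the linear predictor, and multiplies item probabilities to obtain the probability of a whole response vector under LI. The function names and all numeric values are hypothetical; in particular, putting -1 in the first covariate position so that the first element of each item parameter vector acts as a difficulty is only an illustrative assumption.

```python
import math
from itertools import product

def rasch_prob(theta, x, beta):
    """P(y_ij = 1 | theta_i, x_ij): logistic function of theta_i + x_ij' beta_j."""
    eta = theta + sum(x_k * b_k for x_k, b_k in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-eta))

def pattern_prob(theta, X, B, y):
    """Probability of the response vector y given theta, assuming local independence."""
    prob = 1.0
    for x_j, beta_j, y_j in zip(X, B, y):
        p = rasch_prob(theta, x_j, beta_j)
        prob *= p if y_j == 1 else 1.0 - p
    return prob

# Hypothetical subject with ability 0.3 and J = 3 items (made-up numbers).
# Assumption: first covariate is -1, so beta_j[0] plays the role of a difficulty.
theta_i = 0.3
X_i = [[-1.0, 0.0], [-1.0, 1.0], [-1.0, 0.0]]
B = [[0.5, 0.2], [-0.4, 0.2], [1.0, 0.2]]

# The probabilities of the 2^J response configurations must add up to one.
total = sum(pattern_prob(theta_i, X_i, B, y) for y in product((0, 1), repeat=3))
```

Under LI the joint probability factorizes item by item, which is exactly the property that the extensions in the following sections relax.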
The parameters of the model above can be estimated by three different methods:
Joint Maximum Likelihood, Conditional Maximum Likelihood and Marginal
Maximum Likelihood (MML). The first method is easy to use, but it leads to an
inconsistent estimator of the structural parameters in $\boldsymbol{\beta}_j$ because the overall number of
parameters grows with the sample size. The second method is based on conditioning on
sufficient statistics for the ability parameters and leads to a consistent estimator of the
structural parameters, whereas the third is based on the assumption that subjects are
drawn from a population in which the ability has a continuous distribution, such as
the normal one, or a discrete distribution, so that a latent class approach results.
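The idea behind the conditional approach can be checked numerically. The sketch below assumes that the ability enters the logit additively and that LI holds, so that the raw score (the number of positive responses) is the sufficient statistic for the ability. It shows that the probability of a response pattern given its raw score is the same for very different ability values. The names `joint_prob`, `conditional_prob` and the values in `eta` (standing for the item-specific linear predictors) are hypothetical.

```python
import math
from itertools import product

def joint_prob(y, theta, eta):
    """P(y | theta) under local independence, with logit(p_j) = theta + eta_j."""
    prob = 1.0
    for y_j, e_j in zip(y, eta):
        p = 1.0 / (1.0 + math.exp(-(theta + e_j)))
        prob *= p if y_j else 1.0 - p
    return prob

def conditional_prob(y, theta, eta):
    """P(y | raw score of y, theta): joint probability divided by the score probability."""
    r = sum(y)
    score_prob = sum(joint_prob(z, theta, eta)
                     for z in product((0, 1), repeat=len(y)) if sum(z) == r)
    return joint_prob(y, theta, eta) / score_prob

eta = [0.5, -0.3, 1.2]   # made-up item linear predictors
y = (1, 0, 1)            # made-up response pattern with raw score 2
low = conditional_prob(y, -2.0, eta)
high = conditional_prob(y, 3.0, eta)
# low and high coincide up to rounding: the ability cancels after conditioning.
```

Because the ability cancels, the conditional likelihood depends on the structural parameters only, which is why the number of parameters no longer grows with the sample size.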
2.1 Relaxing the assumption of local independence in the Rasch model
with covariates: log-linear approach
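We assume a log-linear model with two-way interactions for the conditional distribution
of the response variables given the ability level and the covariates. The model
parameterization may then be expressed as
$$\mathbf{p}(\theta_i, \mathbf{X}_i) = \frac{\exp[\mathbf{W}\mathbf{Z}(\mathbf{X}_i)\boldsymbol{\gamma}(\theta_i)]}{\mathbf{1}'\exp[\mathbf{W}\mathbf{Z}(\mathbf{X}_i)\boldsymbol{\gamma}(\theta_i)]},$$
where the vector $\mathbf{p}(\theta_i, \mathbf{X}_i)$ contains the conditional probability of every possible
configuration of the response vector $\mathbf{y}_i$, $\mathbf{W}$ is a design matrix common to all subjects,
$\mathbf{Z}(\mathbf{X}_i)$ is an additional design matrix that incorporates the individual covariates
collected in $\mathbf{X}_i$ and $\boldsymbol{\gamma}(\theta_i)$ is a vector of parameters which includes $\theta_i$. In particular, $\mathbf{W}$
has dimension $2^J \times [J(J+1)/2]$, since the log-linear effects that are used here are $J$
main effects and $J(J-1)/2$ two-way interactions. Let $\mathbf{y}_{i(-j)}$ be the subvector of all
response variables apart from the $j$-th, let $\mathbf{y}_{i(-j_1 j_2)}$ be the subvector in which the
$j_1$-th and $j_2$-th response variables are excluded, and let $\mathbf{0}$ denote a vector of zeros of
suitable dimension. Main effects and two-way interactions correspond, respectively, to
the conditional logits and conditional log-odds ratios:
$$\log\frac{p(y_{ij} = 1 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j)} = \mathbf{0})}{p(y_{ij} = 0 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j)} = \mathbf{0})},$$
$$\log\frac{p(y_{ij_1} = 1, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})\; p(y_{ij_1} = 0, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})}{p(y_{ij_1} = 1, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})\; p(y_{ij_1} = 0, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})}.$$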
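The interpretation of the log-linear effects can be illustrated with a small numerical sketch. The code below builds a design matrix with one row per response configuration and one column per main effect or two-way interaction, turns a vector of effects into configuration probabilities by exponentiating and normalizing, and then recovers one conditional logit and one conditional log-odds ratio. As a simplifying assumption, the covariate design matrix and the parameter vector are collapsed into a single made-up numeric vector `effects` for one subject; the function names are hypothetical as well.

```python
import math
from itertools import combinations, product

def design_matrix(J):
    """Rows: the 2^J configurations. Columns: J main effects, then J(J-1)/2 interactions."""
    pairs = list(combinations(range(J), 2))
    configs = list(product((0, 1), repeat=J))
    W = [list(y) + [y[a] * y[b] for a, b in pairs] for y in configs]
    return configs, pairs, W

def loglinear_probs(W, effects):
    """Configuration probabilities: exponentiate the linear predictor and normalize."""
    scores = [math.exp(sum(w * g for w, g in zip(row, effects))) for row in W]
    total = sum(scores)
    return [s / total for s in scores]

J = 3
configs, pairs, W = design_matrix(J)
# Made-up effects: 3 main effects, then interactions for item pairs (0,1), (0,2), (1,2).
effects = [0.4, -0.2, 0.1, 0.8, 0.0, -0.5]
p = dict(zip(configs, loglinear_probs(W, effects)))

# Conditional logit of item 0 given that the other items are 0: the main effect of item 0.
cond_logit_0 = math.log(p[(1, 0, 0)] / p[(0, 0, 0)])
# Conditional log-odds ratio of items 0 and 1 given item 2 is 0: their interaction.
cond_lor_01 = math.log(p[(1, 1, 0)] * p[(0, 0, 0)] / (p[(1, 0, 0)] * p[(0, 1, 0)]))
```

With J = 3 the matrix `W` has 8 rows and 6 columns, in line with the dimension stated above. A nonzero interaction is what allows two responses to remain associated given the latent trait.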
2.2 Relaxing the assumption of local independence in the Rasch model
with covariates: marginal approach
Marginal models for contingency tables were conceived with the aim of overcoming
some limitations of log-linear models. In fact, the so-called main effects of a log-linear
model are not parameters that describe the univariate marginal distributions to which
they are referred, but they describe the corresponding conditional distributions given
the remaining variables. On the other hand, marginal models allow us to directly
describe the marginal distributions of interest. We follow this approach; the resulting
model is then based on marginal logits
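$$\log\frac{p(y_{ij} = 1 \mid \theta_i, \mathbf{X}_i)}{p(y_{ij} = 0 \mid \theta_i, \mathbf{X}_i)},$$
and marginal log-odds ratios
$$\log\frac{p(y_{ij_1} = 1, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i)\; p(y_{ij_1} = 0, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i)}{p(y_{ij_1} = 1, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i)\; p(y_{ij_1} = 0, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i)}.$$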
Note that these are marginal effects with respect to the other response
variables, but they are conditional on the latent trait and the covariates. It is worth
noting that the parameterization of the model may be simply expressed as
$\boldsymbol{\varphi} = \mathbf{C}\log[\mathbf{M}\mathbf{p}(\theta_i, \mathbf{X}_i)]$, where $\mathbf{M}$ is the marginalization matrix with elements 0 and 1
and $\mathbf{C}$ is the matrix of contrasts with elements $1$ and $-1$. The vector $\boldsymbol{\varphi}$ consists of
$2^J - 1$ effects, which are indicated by $\varphi_T$ for every non-empty subset $T$ of $\{1, \ldots, J\}$. These
effects correspond to the marginal logits and log-odds ratios above when $T = \{j\}$
and $T = \{j_1, j_2\}$, respectively.
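The marginal effects can be computed directly from a vector of configuration probabilities, as in the following sketch. Summing the joint probabilities over the configurations compatible with given values of one or two items plays the role of a row of the marginalization matrix, and the differences of logarithms play the role of the contrasts. The function names and the success probabilities are hypothetical, and the joint distribution is built under LI only to obtain a case with a known answer.

```python
import math
from itertools import product

def marginal(p, items, values):
    """Sum the joint probabilities of all configurations with y[j] == v for the given items."""
    return sum(prob for y, prob in p.items()
               if all(y[j] == v for j, v in zip(items, values)))

def marginal_logit(p, j):
    return math.log(marginal(p, [j], [1]) / marginal(p, [j], [0]))

def marginal_log_odds_ratio(p, j1, j2):
    num = marginal(p, [j1, j2], [1, 1]) * marginal(p, [j1, j2], [0, 0])
    den = marginal(p, [j1, j2], [1, 0]) * marginal(p, [j1, j2], [0, 1])
    return math.log(num / den)

# Hypothetical joint distribution of J = 3 items for one value of the latent trait
# and the covariates, built under local independence from made-up probabilities.
success = [0.2, 0.5, 0.7]
p = {}
for y in product((0, 1), repeat=3):
    prob = 1.0
    for y_j, q_j in zip(y, success):
        prob *= q_j if y_j else 1.0 - q_j
    p[y] = prob

logits = [marginal_logit(p, j) for j in range(3)]
lor_01 = marginal_log_odds_ratio(p, 0, 1)
```

In this independent case each marginal logit equals the logit of the corresponding success probability and every marginal log-odds ratio is zero, so nonzero log-odds ratios given the latent trait directly measure departures from LI.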
3. Estimation and Applications
Estimation of the parameters of the proposed models is carried out by the MML
method. To this end, an efficient implementation of the Expectation-Maximization
algorithm of Dempster et al. (1977) is proposed, which allows us to estimate the
parameters even with a large number of items. As usual, the resulting algorithm is based
on alternating two steps until convergence:
E-step: computing the conditional expected value of the log-likelihood of the
complete data, which correspond to the response configuration and the latent
variable level for every subject in the sample;
M-step: updating the model parameters by maximizing the expected value
computed above.
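The alternation of the two steps can be sketched on a deliberately simplified case. The code below is not the efficient implementation proposed in the thesis: it assumes a plain Rasch model without covariates in which LI holds, a discrete latent distribution with fixed support points (a latent class formulation), and a few Newton steps per item in place of an exact maximization. The data set, the support points and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def pattern_prob(y, theta, difficulties):
    """P(y | theta) under local independence, with p_j = sigmoid(theta - b_j)."""
    prob = 1.0
    for y_j, b_j in zip(y, difficulties):
        p = sigmoid(theta - b_j)
        prob *= p if y_j else 1.0 - p
    return prob

def em_latent_class_rasch(data, support, n_iter=200, newton_steps=3):
    J = len(data[0])
    weights = [1.0 / len(support)] * len(support)
    difficulties = [0.0] * J
    loglik_path = []
    for iteration in range(n_iter):
        # E-step: posterior probability of each latent class for every subject.
        posteriors, loglik = [], 0.0
        for y in data:
            joint = [w * pattern_prob(y, t, difficulties)
                     for t, w in zip(support, weights)]
            marg = sum(joint)
            loglik += math.log(marg)
            posteriors.append([v / marg for v in joint])
        loglik_path.append(loglik)
        # M-step, class weights: closed form (average posterior probability).
        weights = [sum(post[c] for post in posteriors) / len(data)
                   for c in range(len(support))]
        # M-step, difficulties: Newton steps on the expected complete-data
        # log-likelihood, which separates across items.
        for j in range(J):
            for step in range(newton_steps):
                grad = hess = 0.0
                for y, post in zip(data, posteriors):
                    for t, q in zip(support, post):
                        p = sigmoid(t - difficulties[j])
                        grad += q * (p - y[j])
                        hess += q * p * (1.0 - p)
                difficulties[j] += grad / hess
    return weights, difficulties, loglik_path

# Made-up responses of 10 subjects to 3 items, and two fixed ability levels.
data = [(1, 1, 1), (1, 1, 0), (1, 0, 0), (0, 0, 0), (1, 1, 0),
        (1, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1), (0, 0, 0)]
support = [-1.0, 1.0]
weights, difficulties, loglik_path = em_latent_class_rasch(data, support)
```

The list `loglik_path` stores the marginal log-likelihood at the start of each iteration and can be used to monitor convergence. In the extended models the E-step has the same structure, while the M-step has to handle the log-linear or marginal parameterization of the conditional response distribution.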
To illustrate the proposed approaches, data concerning tests that
assess the knowledge of the Italian language (CELI) are analyzed. These tests are organized on the basis
of four skills: reading, writing, comprehension, and conversation. The certification
comprises five difficulty levels, from CELI1 to CELI5. The CELI3 data are used in the
application. The response variables considered here refer to comprehension and
correspond to 9 items administered to 2681 students. Some covariates are considered,
such as nationality, age, gender and the center where the questionnaire was administered.
The other application is focused on data from the InChianti study. This is a longitudinal
study, although only data from the first two occasions are available. The response
variables derive from the Mini-Mental State Examination, which is a screening test
designed to detect cognitive deterioration, assess the severity of dementia and evaluate
changes over time. For the application, 7 items are considered which measure a
reduced set of cognitive functions.
References
Agresti A. (1990). Categorical Data Analysis, Wiley, New York.
Bartolucci F. and Forcina A. (2000). A likelihood ratio test for MTP2 within binary
variables, The Annals of Statistics, 28, 1206-1218.
Bartolucci F. and Forcina A. (2002). Extended RC association models allowing for
order restriction and marginal modeling, Journal of the American Statistical Association,
97, 1192-1199.
Bartolucci F., Forcina A. and Dardanoni V. (2001). Positive quadrant dependence and
marginal modelling in two-way tables with ordered marginals, Journal of the American
Statistical Association, 96, 1497-1505.
Bartolucci F., Colombi R. and Forcina A. (2007). An extended class of marginal link
function for modelling contingency tables by equality and inequality constraints,
Statistica Sinica, 17, 691-711.
Dempster A. P., Laird N. M. and Rubin D. B. (1977). Maximum likelihood from
incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series
B, 39, 1-38.
Forcina A. and Bartolucci F. (2000). Modelling quality of life variables with non-
parametric mixtures, Environmetrics, 15, 1-10.
Glonek G. F. V. and McCullagh P. (1995). Multivariate logistic models, Journal of the Royal
Statistical Society, Series B, 57, 533-546.
Goodman L. A. (1974). Exploratory latent structure analysis using both identifiable and
unidentifiable models, Biometrika, 61, 215-231.
Haberman S. J. (1977). Maximum likelihood estimates in exponential response
models, The Annals of Statistics, 5, 815-841.
Hambleton R. K. and Swaminathan H. (1991). Item Response Theory: Principles and
Applications, Kluwer Nijhoff Publishing, Boston.
Lord F. (1952). A theory of test scores, Psychometric Monographs, 7.
McCullagh P. and Nelder J. A. (1989). Generalized Linear Models, Chapman and Hall,
London.
Rasch G. (1960). Probabilistic models for some intelligence and attainment tests,
Danish Institute for Educational Research, Copenhagen.