Violation of the assumption of local independence in the Rasch model with covariates

January 2010

1:33-39

Authors:

Daniela Fruttini

Università degli Studi di Perugia

Department of Economics, Finance and Statistics

daniela.fruttini@stat.unipg.it

Keywords: Binary variables; EM algorithm; log-linear models; marginal models.

1. Introduction

In many contexts, an individual characteristic which is not directly observable is

measured by the responses to a set of items. For example, in geriatrics, the quality-of-

life of elderly people is measured by responses to items which concern many aspects of

the daily living (WHO, 1980), such as "Without aid, have you any difficulty in

washing? or eating?". Responses to these items should discriminate certain categories of

individuals corresponding to different levels of disability. Similar approaches are

adopted in psychology, assessment of learning and in the social sciences.

Several methods are available for the analysis of data deriving from the administration

of a questionnaire made of items of the type above. Lord was the first to propose a

model which is tailored to the analysis of such data, see Lord (1952), and he is therefore

considered the founder of Item Response Theory (IRT). Many other models have

been proposed starting from the basic approach of Lord; for a review see, among others,

Hambleton and Swaminathan (1991). Among these models, the one developed by Rasch

(1960) is still the most popular. One of the basic assumptions of this and many other

IRT models is that of local independence (LI): the responses provided by the same

subject to different items are conditionally independent given the latent characteristic.

In many cases, the LI assumption may be too restrictive, since the response variables are

correlated even conditionally on the latent characteristic they measure. This typically

happens when individuals respond to the same items on different occasions or when there

is a learning-through-training effect of an item on the other items. In these situations,

the LI assumption must be relaxed in a suitable way in order to avoid distortion in

estimating the parameters of interest.

The present work is a summary of the thesis developed within the Doctorate program in

“Mathematical and Statistical Methods for the Economic and Social Sciences”, at the

University of Perugia; the aim of the thesis is to show how the LI assumption may be

relaxed in the Rasch model with covariates. Two main approaches, based on log-linear

and marginal models, are proposed. The first one has the advantage of being easier to

implement, but it gives rise to an IRT model with parameters which are more difficult to

interpret. The marginal approach is more complicated to implement, but it gives rise to a

more interpretable model. The work especially deals with the efficient implementation

of the estimation algorithms for the resulting models. These algorithms are used for the

analysis of real data. In particular, the approaches are illustrated by the analysis of data

concerning tests for the evaluation of knowledge of the Italian language (CELI) made

available by the University for Foreigners in Perugia. In a second application, data are

analyzed which come from a longitudinal investigation (InChianti) of a population of

elderly subjects living in the area of Chianti, Tuscany. This survey aims to assess the

possible causes of psycho-physical deterioration of individuals due to ageing.

2. Extensions of the Rasch model with covariates

Suppose that a questionnaire of J items is administered to a group of n subjects. The binary response variable for the answer of individual i to item j is denoted by $y_{ij}$, and $\mathbf{y}_i = (y_{i1}, \ldots, y_{iJ})'$ is the response vector of the same subject. One of the basic assumptions of the Rasch model with covariates is that the responses to the items depend on a single latent trait, which is typically interpreted as ability, in the following way:

$$p_{ij} = p(y_{ij} = 1 \mid \theta_i, \mathbf{x}_{ij}) = \frac{\exp(\theta_i + \mathbf{x}_{ij}'\boldsymbol{\beta}_j)}{1 + \exp(\theta_i + \mathbf{x}_{ij}'\boldsymbol{\beta}_j)},$$

where $\theta_i$ is the ability of subject i, $\boldsymbol{\beta}_j$ is a vector of parameters for item j (which includes the difficulty level) and $\mathbf{x}_{ij}$ is the corresponding vector of covariates having a suitable structure. As noted above, the Rasch model also assumes LI, i.e. the response variables in $\mathbf{y}_i$ are conditionally independent given $\theta_i$.
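The response probability and the LI assumption above can be illustrated with a minimal Python sketch. It assumes the logit is the ability plus the linear predictor of the item, and the function names (`rasch_prob`, `joint_prob_li`) and the toy parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def rasch_prob(theta, X, beta):
    """P(y_ij = 1 | theta_i, x_ij) for the J items of one subject.

    theta : float, ability of the subject
    X     : (J, q) array, row j is the covariate vector x_ij
    beta  : (J, q) array, row j is the item parameter vector beta_j
    """
    eta = theta + np.sum(X * beta, axis=1)  # theta_i + x_ij' beta_j
    return 1.0 / (1.0 + np.exp(-eta))

def joint_prob_li(y, theta, X, beta):
    """P(y_i | theta_i, X_i) under local independence: a product over items."""
    p = rasch_prob(theta, X, beta)
    return float(np.prod(np.where(y == 1, p, 1.0 - p)))

# Hypothetical example: J = 3 items, covariate vector reduced to an intercept
X = np.ones((3, 1))
beta = np.array([[-1.0], [0.0], [1.0]])
p = rasch_prob(0.0, X, beta)
```

Under LI the conditional probabilities of the 2^J response configurations are fully determined by the J item probabilities, which is the restriction relaxed in the following sections.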

The parameters of the model above can be estimated by using three different methods:

Joint Maximum Likelihood, Conditional Maximum Likelihood and Marginal

Maximum Likelihood (MML). The first method is easy to use, but it leads to an

inconsistent estimator of the structural parameters in $\boldsymbol{\beta}_j$ because the overall number of

parameters grows with the sample size. The second method is based on conditioning on

sufficient statistics for the ability parameters and leads to a consistent estimator of the

structural parameters, whereas the third is based on the assumption that subjects are

extracted from a population in which the ability has a continuous distribution, such as

the normal one, or a discrete distribution, so that a latent class approach results.
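As a rough illustration of the MML idea with a normal ability distribution, the sketch below integrates the ability out of the LI likelihood by Gauss-Hermite quadrature. It is a simplified setting, not the estimation procedure of the thesis: covariates are dropped, each item has a single intercept `beta[j]`, and the function names and the number of quadrature nodes are assumptions.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def marginal_pattern_probs(Y, beta, sigma=1.0, n_nodes=21):
    """Marginal probability of each row of Y, with theta ~ N(0, sigma^2).

    Y    : (n, J) binary array of response patterns
    beta : (J,) item intercepts, logit P(y_ij = 1) = theta_i + beta_j
    """
    nodes, weights = hermgauss(n_nodes)       # rule for integrals against exp(-x^2)
    theta = np.sqrt(2.0) * sigma * nodes      # change of variable to N(0, sigma^2)
    w = weights / np.sqrt(np.pi)              # rescaled weights, summing to one
    eta = theta[:, None] + beta[None, :]      # (K, J) logits at every node
    logp1 = -np.log1p(np.exp(-eta))           # log P(y = 1 | theta_k)
    logp0 = -np.log1p(np.exp(eta))            # log P(y = 0 | theta_k)
    cond = Y @ logp1.T + (1 - Y) @ logp0.T    # (n, K) log P(y_i | theta_k) under LI
    return np.exp(cond) @ w

def marginal_loglik(Y, beta, sigma=1.0, n_nodes=21):
    """Marginal log-likelihood, the objective maximized by MML."""
    return float(np.sum(np.log(marginal_pattern_probs(Y, beta, sigma, n_nodes))))

# Hypothetical check on all 2^3 response patterns
patterns = np.array([(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)])
beta_demo = np.array([-1.0, 0.0, 1.0])
pattern_probs = marginal_pattern_probs(patterns, beta_demo)
```

Replacing the quadrature nodes and weights with free support points and class weights gives the discrete (latent class) version mentioned in the text.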

2.1 Relaxing the assumption of local independence in the Rasch model

with covariates: log-linear approach

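We assume a log-linear model with two-way interactions for the conditional distribution of the response variables given the ability level and the covariates. The model parameterization may then be expressed as

$$\mathbf{p}(\theta_i, \mathbf{X}_i) = \frac{\exp[\mathbf{W}\mathbf{Z}(\mathbf{X}_i)\boldsymbol{\gamma}(\theta_i)]}{\mathbf{1}'\exp[\mathbf{W}\mathbf{Z}(\mathbf{X}_i)\boldsymbol{\gamma}(\theta_i)]},$$

where the vector $\mathbf{p}(\theta_i, \mathbf{X}_i)$ contains the conditional probability of every possible configuration of the response vector $\mathbf{y}_i$, $\mathbf{W}$ is a design matrix common to all subjects, $\mathbf{Z}(\mathbf{X}_i)$ is an additional design matrix that incorporates the individual covariates collected in $\mathbf{X}_i$ and $\boldsymbol{\gamma}(\theta_i)$ is a vector of parameters which includes $\theta_i$. In particular, $\mathbf{W}$ has dimension $2^J \times [J + J(J-1)/2]$, since the log-linear effects that are used here are $J$ main effects and $J(J-1)/2$ two-way interactions. Let $\mathbf{y}_{i(-j)}$ be the subvector of all response variables apart from the $j$-th, let $\mathbf{y}_{i(-j_1 j_2)}$ be the subvector in which the $j_1$-th and $j_2$-th response variables are excluded, and let $\mathbf{0}$ denote a vector of zeros of suitable dimension. Main effects and two-way interactions correspond, respectively, to the conditional logits and conditional log-odds ratios:

$$\log\frac{p(y_{ij} = 1 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j)} = \mathbf{0})}{p(y_{ij} = 0 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j)} = \mathbf{0})},$$

$$\log\frac{p(y_{ij_1} = 1, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})\; p(y_{ij_1} = 0, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})}{p(y_{ij_1} = 1, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})\; p(y_{ij_1} = 0, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i, \mathbf{y}_{i(-j_1 j_2)} = \mathbf{0})}.$$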
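The interpretation of the log-linear effects can be checked numerically with a small Python sketch. It builds a design matrix W with J main-effect columns and J(J-1)/2 two-way interaction columns and normalizes the exponentiated linear predictor. The vector `lam` is a hypothetical stand-in for the subject-specific product of the covariate design matrix and the parameter vector, and the numerical values are arbitrary.

```python
import itertools
import numpy as np

def design_matrix(J):
    """Return (configs, W) with W of dimension 2^J x [J + J(J-1)/2], for J >= 2."""
    configs = np.array(list(itertools.product([0, 1], repeat=J)))
    pairs = list(itertools.combinations(range(J), 2))
    inter = np.column_stack([configs[:, a] * configs[:, b] for a, b in pairs])
    return configs, np.hstack([configs, inter])

def loglinear_probs(W, lam):
    """p = exp(W lam) / 1' exp(W lam), computed in a numerically stable way."""
    u = W @ lam
    e = np.exp(u - u.max())
    return e / e.sum()

# Hypothetical effects for J = 3: three main effects, then pairs (0,1), (0,2), (1,2)
configs, W = design_matrix(3)
lam = np.array([0.5, -0.2, 0.1, 0.8, 0.0, -0.4])
p = loglinear_probs(W, lam)

# Rows follow itertools.product order: index 0 is (0,0,0), 4 is (1,0,0),
# 2 is (0,1,0) and 6 is (1,1,0).
cond_logit_item0 = np.log(p[4] / p[0])              # other responses fixed at 0
cond_lor_items01 = np.log(p[6] * p[0] / (p[4] * p[2]))
```

In this sketch the conditional logit of the first item recovers its main effect and the conditional log-odds ratio of the first two items recovers their interaction, in line with the definitions above.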

2.2 Relaxing the assumption of local independence in the Rasch model

with covariates: marginal approach

Marginal models for contingency tables were conceived with the aim of overcoming

some limitations of log-linear models. In fact, the so-called main effects of a log-linear

model are not parameters that describe the univariate marginal distributions to which

they are referred, but they describe the corresponding conditional distributions given

the remaining variables. On the other hand, marginal models allow us to directly

describe the marginal distributions of interest. We follow this approach; the resulting

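model is then based on marginal logits

$$\log\frac{p(y_{ij} = 1 \mid \theta_i, \mathbf{X}_i)}{p(y_{ij} = 0 \mid \theta_i, \mathbf{X}_i)}$$

and marginal log-odds ratios

$$\log\frac{p(y_{ij_1} = 1, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i)\; p(y_{ij_1} = 0, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i)}{p(y_{ij_1} = 1, y_{ij_2} = 0 \mid \theta_i, \mathbf{X}_i)\; p(y_{ij_1} = 0, y_{ij_2} = 1 \mid \theta_i, \mathbf{X}_i)}.$$

Note that these are marginal effects with respect to the other response variables, but they are still conditional on the latent trait and the covariates. It is worth noting that the parameterization of the model may be simply expressed as

$$\boldsymbol{\varphi} = \mathbf{C}\log[\mathbf{M}\mathbf{p}(\theta_i, \mathbf{X}_i)],$$

where $\mathbf{M}$ is the marginalization matrix with elements 0 and 1 and $\mathbf{C}$ is the matrix of contrasts with elements $-1$, $0$ and $1$. The vector $\boldsymbol{\varphi}$ consists of $2^J - 1$ effects, which are indicated by $\varphi_T$ for every non-empty subset $T$ of $\{1, \ldots, J\}$. These effects correspond to the marginal logits and log-odds ratios above when $T = \{j\}$ and $T = \{j_1, j_2\}$, respectively.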
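A minimal Python sketch of the marginal parameterization follows. It assumes the usual construction of such matrices: for each non-empty subset T, the rows of M sum the joint probabilities over the variables outside T, and the matching row of C takes a signed contrast of the log marginal probabilities. The function name, the ordering of the subsets and the toy probabilities are assumptions made for illustration.

```python
import itertools
import numpy as np

def marginal_matrices(J):
    """Return (subsets, C, M) such that the effects equal C @ log(M @ p)."""
    configs = np.array(list(itertools.product([0, 1], repeat=J)))
    subsets = [T for r in range(1, J + 1)
               for T in itertools.combinations(range(J), r)]
    m_rows, blocks = [], []
    for T in subsets:
        start, signs = len(m_rows), []
        for c in itertools.product([0, 1], repeat=len(T)):
            # 0/1 row selecting the cells whose T-coordinates equal c
            m_rows.append(np.all(configs[:, list(T)] == c, axis=1).astype(float))
            # contrast sign: +1 when an even number of coordinates equal 0
            signs.append((-1.0) ** (len(T) - sum(c)))
        blocks.append((start, signs))
    M = np.array(m_rows)
    C = np.zeros((len(subsets), M.shape[0]))
    for row, (start, signs) in enumerate(blocks):
        C[row, start:start + len(signs)] = signs
    return subsets, C, M

# Hypothetical check: three independent items with success probabilities q
q = np.array([0.2, 0.5, 0.7])
configs = np.array(list(itertools.product([0, 1], repeat=3)))
p_ind = np.prod(np.where(configs == 1, q, 1.0 - q), axis=1)
subsets, C, M = marginal_matrices(3)
effects = C @ np.log(M @ p_ind)
```

With independent items the first J effects reduce to the logits of the single items and all effects of higher order vanish, so nonzero log-odds ratios measure the departure from independence.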

3. Estimation and Applications

Estimation of the parameters of the proposed models is carried out by the MML

method. To this end, an efficient implementation of the Expectation-Maximization (EM)

algorithm of Dempster et al. (1977) is proposed, which allows us to estimate the

parameters even with a large number of items. As usual, the resulting algorithm is based

on alternating two steps until convergence:

• E-step: consists of computing the conditional expected value of the log-likelihood

of the complete data, which correspond to the response

configuration and the latent variable level for every subject in the sample;

• M-step: consists of updating the model parameters by maximizing the

expected value computed above.
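The two steps can be sketched in Python for a deliberately simplified case. This is not the algorithm developed in the thesis: it keeps the LI assumption, drops the covariates, uses a discrete ability distribution with fixed support points `theta`, and replaces the exact M-step for the item parameters with a few Newton steps. All names and the simulated data are hypothetical.

```python
import numpy as np

def em_latent_class_rasch(Y, theta, n_iter=50, newton_steps=5):
    """EM for P(y_ij = 1 | class c) = expit(theta_c + beta_j).

    Y     : (n, J) binary responses
    theta : (k,) fixed support points of the discrete ability distribution
    Returns class weights pi (k,), item parameters beta (J,) and the
    log-likelihood evaluated at the start of every iteration.
    """
    n, J = Y.shape
    k = len(theta)
    pi = np.full(k, 1.0 / k)
    beta = np.zeros(J)
    trace = []
    for _ in range(n_iter):
        # E-step: posterior probability of each latent class for every subject
        eta = theta[:, None] + beta[None, :]                      # (k, J)
        logp1 = -np.log1p(np.exp(-eta))
        logp0 = -np.log1p(np.exp(eta))
        log_joint = Y @ logp1.T + (1 - Y) @ logp0.T + np.log(pi)  # (n, k)
        log_marg = np.logaddexp.reduce(log_joint, axis=1)
        trace.append(float(log_marg.sum()))
        post = np.exp(log_joint - log_marg[:, None])
        # M-step: closed form for pi, Newton steps for every beta_j
        n_c = post.sum(axis=0)             # expected class sizes
        pi = n_c / n
        s = Y.sum(axis=0)                  # observed item totals
        for _ in range(newton_steps):
            p = 1.0 / (1.0 + np.exp(-(theta[:, None] + beta[None, :])))
            grad = s - n_c @ p             # score of the expected log-likelihood
            hess = n_c @ (p * (1.0 - p))   # minus its second derivative
            beta = beta + grad / hess
    return pi, beta, trace

# Simulated data (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
true_beta = np.array([-1.0, -0.3, 0.3, 1.0])
ability = rng.normal(size=500)
prob = 1.0 / (1.0 + np.exp(-(ability[:, None] + true_beta[None, :])))
Y = (rng.random(prob.shape) < prob).astype(float)
pi_hat, beta_hat, trace = em_latent_class_rasch(Y, np.linspace(-3.0, 3.0, 7))
```

In the extended models the same scheme applies, but the E-step works with the probabilities of whole response configurations rather than with products over items, which is why an efficient implementation matters when the number of items is large.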

In order to illustrate the proposed approaches, data are analyzed which concern tests to

assess knowledge of the Italian language (CELI). These tests are organized on the basis

of the four skills: reading, writing, comprehension, and conversation. Such certification

comprises five difficulty levels, from CELI1 to CELI5. The CELI3 data are used in the

application. The response variables considered here refer to comprehension and

correspond to 9 items administered to 2681 students. Some covariates are considered,

such as nationality, age, gender and center where the questionnaire was administered.

The other application is focused on data from the InChianti study. This is a longitudinal

study, although only data referring to the first two occasions are available. The response

variables derive from the Mini-Mental State Examination, which is a screening test

designed to detect cognitive deterioration, assess the severity of dementia and evaluate

changes over time. For the application, 7 items are considered which measure a

reduced set of cognitive functions.

References

Agresti A. (1990). Categorical Data Analysis, Wiley, New York.

Bartolucci F. and Forcina A. (2000). A likelihood ratio test for MTP2 within binary variables, The Annals of Statistics, 28, 1206-1218.

Bartolucci F. and Forcina A. (2002). Extended RC association models allowing for order restriction and marginal modeling, Journal of the American Statistical Association, 97, 1192-1199.

Bartolucci F., Forcina A. and Dardanoni V. (2001). Positive quadrant dependence and marginal modelling in two-way tables with ordered marginals, Journal of the American Statistical Association, 96, 1497-1505.

Bartolucci F., Colombi R. and Forcina A. (2007). An extended class of marginal link functions for modelling contingency tables by equality and inequality constraints, Statistica Sinica, 17, 691-711.

Dempster A. P., Laird N. M. and Rubin D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, 39, 1-38.

Forcina A. and Bartolucci F. (2000). Modelling quality of life variables with non-parametric mixtures, Environmetrics, 15, 1-10.

Glonek G. F. V. and McCullagh P. (1995). Multivariate logistic models, Journal of the Royal Statistical Society, Series B, 57, 533-546.

Goodman L. A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models, Biometrika, 61, 215-231.

Haberman S. J. (1977). Maximum likelihood estimates in exponential response models, The Annals of Statistics, 5, 815-841.

Hambleton R. K. and Swaminathan H. (1991). Item Response Theory: Principles and Applications, Kluwer Nijhoff Publishing, Boston.

Lord F. (1952). A theory of test scores, Psychometric Monographs, 7.

McCullagh P. and Nelder J. A. (1989). Generalized Linear Models, Chapman and Hall, London.

Rasch G. (1960). Probabilistic models for some intelligence and attainment tests, Danish Institute for Educational Research, Copenhagen.
