
Application of the Health Assessment Questionnaire disability index to various rheumatic diseases

November 2010
Quality of Life Research 19(9):1255-63

DOI:10.1007/s11136-010-9690-9

Source
PubMed

License
CC BY-NC 4.0

Authors:

Maaike M. Van Groen

Cito Arnhem

Peter Meindert ten Klooster

University of Twente

Erik Taal

University of Twente

Mart A F J van de Laar

University of Twente



Maaike M. van Groen · Peter M. ten Klooster · Erik Taal · Mart A. F. J. van de Laar · Cees A. W. Glas

Accepted: 6 June 2010 / Published online: 18 June 2010

© The Author(s) 2010. This article is published with open access at Springerlink.com

Abstract

Purpose To investigate whether the Stanford Health Assessment Questionnaire Disability Index (HAQ-DI) can serve as a generic instrument for measuring disability across different rheumatic diseases and to propose a scoring method based on item response theory (IRT) modeling to support this goal.

Methods The HAQ-DI was administered to a cross-sectional sample of patients with confirmed rheumatoid arthritis (n = 619), osteoarthritis (n = 125), or gout (n = 102). The results were analyzed using the generalized partial credit model as an IRT model.

Results It was found that 4 out of 8 item categories of the HAQ-DI displayed substantial differential item functioning (DIF) over the three diseases. Further, it was shown that this DIF could be modeled using an IRT model with disease-specific item parameters, which produces measures that are comparable for the three diseases.

Conclusion Although the HAQ-DI partially functioned differently in the three disease groups, the measurement regarding the disability level of the patients can be made comparable using IRT methods.

Keywords Rheumatoid arthritis · Osteoarthritis · Gout · Health-related quality of life · Item response theory · Differential item functioning

Abbreviations

DIF Differential item functioning

HAQ-DI Health Assessment Questionnaire Disability Index

IRT Item response theory

LM Lagrange multiplier

OA Osteoarthritis

PF Physical functioning scale

PsA Psoriatic arthritis

RA Rheumatoid arthritis

SF-36 Medical Outcomes Study 36-Item Short Form

Introduction

Besides the traditional use of physical and biochemical measures, patient-centered outcomes have become more and more important as outcome measures of interventions [1]. For example, patient-reported disability has become a standard outcome in the clinical studies of rheumatic diseases. One of the most widely used self-reported measures of physical disability is the Stanford Health Assessment Questionnaire Disability Index (HAQ-DI) [2]. Although often referred to as a disease-specific measure, it assesses physical disability in general and does not focus on specific disease-associated impairments. In fact, according to its developers, it was originally intended for use in multiple illnesses so that the impact of different disease processes could be compared [1, 3]. As a result, the scale has been used across a wide range of general and clinical populations.

M. M. van Groen · P. M. ten Klooster (corresponding author) · E. Taal · M. A. F. J. van de Laar · C. A. W. Glas
Institute for Behavioral Research, Faculty of Behavioral Sciences, University of Twente, PO Box 217, 7500 AE Enschede, The Netherlands
e-mail: P.M.tenKlooster@utwente.nl

M. A. F. J. van de Laar
Department of Rheumatology, Medisch Spectrum Twente, PO Box 50.000, 7500 KA Enschede, The Netherlands

Qual Life Res (2010) 19:1255–1263

Especially in the field of rheumatology, the HAQ-DI has become the measure of choice for assessing physical disability in several specific rheumatic diseases. Although physical disability is common among all musculoskeletal conditions, rheumatic diseases can vary widely in their underlying disease mechanisms, clinical manifestations, progress and severity, and composition of the populations generally affected, all of which may influence the measurement characteristics and resulting disability scores across diseases. Nonetheless, mean HAQ-DI scores are frequently used to directly compare the severity of disability across different rheumatic diseases, whether or not adjusted for some known covariates [4–9]. The purpose of the current study is to investigate whether the HAQ-DI is indeed a generic instrument, and if this proves problematic, to model response behavior on disease-specific items of the instrument in such a way that the measurement results are comparable over different groups of rheumatic patients.

The construct validity of the HAQ-DI has been previously established in numerous studies [1], mostly using classical psychometric techniques such as factor analysis. Cole et al. (2005, 2006), for instance, show that there is considerable support for a single-factor structure and for comparability of scores of patients with systemic sclerosis and patients with rheumatoid arthritis. However, some of the results of these analyses, such as the presence of correlated residuals, invite further attention. In the present article, construct validity is investigated using a unidimensional item response theory (IRT) model. The relation between IRT modeling and factor-analytic approaches will be returned to in the discussion section.

In IRT models, observed responses are related to a unidimensional latent trait, that is, to some underlying scale. The unidimensional latent scale of the HAQ-DI pertains to the disability level of the patients. The observed responses are explained by the persons' disability parameters and by item parameters related to the probability that a person with a certain disability parameter endorses an item. One of the common assumptions of IRT is measurement invariance, that is, the latent scale applies to all respondents from some population and items have the same measurement characteristics, that is, the same item parameters, for these respondents. A violation of these two assumptions is known as differential item functioning (DIF). An item shows DIF if the probability of responding in the different categories of the item varies across groups of patients with the same disability level [10, 11]. In other words, an item is biased if the observed item score, conditional on the latent disability level of the patients, differs between subgroups [12]. In the current study, the construct validity of the HAQ-DI is investigated by assessing DIF for patients with three different types of arthritis.

DIF is often investigated using the generalized partial credit model as an IRT model [11]. The generalized partial credit model [13] applies to polytomously scored items, such as the items of the HAQ-DI. The probability of a score in category $x$ of item $i$ is given by the item response curve

$$P(X_{ni} = x \mid \theta_n) = \frac{\exp\left(\sum_{j=1}^{x} (\alpha_i \theta_n - \delta_{ij})\right)}{1 + \sum_{r=1}^{m_i} \exp\left(\sum_{j=1}^{r} (\alpha_i \theta_n - \delta_{ij})\right)},$$

where $\theta_n$ is the latent disability level of patient $n$. In the model, $m_i$ denotes the number of item categories. Further, $\delta_{ij}$ and $\alpha_i$ are item parameters. $\delta_{ij}$ is a category intersection parameter, that is, it is the point in which the probability of responding in category $j-1$ is equal to the probability of responding in category $j$. Finally, $\alpha_i$ is a discrimination parameter that indicates the extent to which the item response is related to the latent scale. This discrimination parameter is comparable to a factor loading in a factor analysis model.
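As an illustration of the item response curve, the following is a minimal Python sketch that evaluates the category probabilities of one item. It assumes the parameterization $\alpha_i \theta_n - \delta_{ij}$ with scores 0 to 3 and an empty sum (numerator 1) for score 0; the function name `gpcm_probabilities` is hypothetical, and the parameter values are those listed for the Rising category in Table 3. It is not the estimation software used in the study.

```python
import math


def gpcm_probabilities(theta, alpha, deltas):
    """Category probabilities P(X = 0), ..., P(X = m) for a single item.

    theta  : latent disability level of one patient
    alpha  : discrimination parameter of the item
    deltas : category intersection parameters delta_1, ..., delta_m
    """
    # Running sums of (alpha * theta - delta_j); score 0 has an empty sum.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + alpha * theta - delta)
    # Subtracting the largest exponent avoids overflow without changing ratios.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


# Rising category, parameters as listed in Table 3; theta = 1.0 is arbitrary.
probs = gpcm_probabilities(theta=1.0, alpha=3.758, deltas=[-0.089, 3.777, 4.660])
for score, p in enumerate(probs):
    print(f"P(score = {score}) = {p:.3f}")
```

Under this parameterization, the ratio of the probabilities of adjacent categories $j$ and $j-1$ equals $\exp(\alpha_i \theta_n - \delta_{ij})$, so the two categories are equally likely where $\alpha_i \theta_n = \delta_{ij}$.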

If DIF is not present, this is unambiguous support for the construct validity of the instrument. If DIF is present, however, the type of DIF becomes important. As previously noted, measurement invariance pertains to the presence of the same latent variable in all subgroups and constancy of item parameters over subgroups. If only the latter assumption is violated by a limited number of items, comparability can often still be realized and construct validity may still be defendable. For example, a question regarding the number of cars in the household may be a good item for measuring the latent variable Wealth, though the metric in downtown New York and in Texas may be quite different. In IRT, such differences can be modeled by group-specific item parameters. This approach is, of course, only defendable if it can be explicitly shown that the responses to the items given in the two groups pertain to the same latent variable, that is, that it can be shown that the same IRT model holds for the entire set of response data. This approach to modeling DIF, which has a considerable tradition in educational measurement [14–17] and in consumer research [18], will also be applied in the present study.

Patients and methods

Respondents for this study were recruited during several waves of data collection in the period between 2005 and 2008 at the outpatient rheumatology clinic of the Medisch Spectrum Twente hospital in Enschede, the Netherlands. During data collection days, consecutive patients visiting the outpatient clinic were asked to participate. As the study did not interfere with usual treatment, ethical approval was not required according to national legislation and local institutional policy.

In total, 846 patients with physician-confirmed rheumatoid arthritis (RA), osteoarthritis (OA), or gout agreed to participate. Of the included patients, 619 patients were treated for RA, 125 for OA, and 102 for gout. Table 1 gives a number of characteristics of the sample. The majority of the patients were women, but as would be expected, the gout sample consisted of only 18% women. Mean age was 62 with a standard deviation of 13.6 years. A validated Dutch version of the HAQ was used [19]. The average scores on the Medical Outcomes Study 36-Item Short Form (SF-36) health survey [20] were reasonably comparable across the three conditions. HAQ-DI scores were similar for patients with RA and OA, whereas patients with gout reported substantially less disability.

Scoring the HAQ-DI

The Health Assessment Questionnaire Disability Index (HAQ-DI) consists of 20 questions regarding the limitations patients experience in performing daily physical activities [2]. Patients are asked how difficult it is to perform an activity on a scale of 0 (without any difficulty) to 3 (unable to do). Patients are also asked whether they need assistance or aids for the activity.

The questions of the HAQ-DI are ordered into eight categories of daily living, covering Rising, Walking, Dressing and grooming, Reach, Eating, Grip, Activities, and Hygiene. The highest item score within a category is used as the score for this category, essentially reducing the HAQ-DI to an 8-item scale. If a respondent indicates the use of assistance or aids for a category and the highest item score within that category is 0 or 1, the category score is raised to 2. The scores on the categories are averaged to construct a single total score.
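The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `haq_di_score`, the input layout, and the example responses (including the number of items shown per category) are assumptions made for the example, not part of the questionnaire documentation.

```python
def haq_di_score(item_scores, aids_used=frozenset()):
    """Standard HAQ-DI total score from item-level responses.

    item_scores : dict mapping a category name to its item scores (each 0..3)
    aids_used   : categories for which the patient reports aids or assistance
    """
    category_scores = []
    for category, scores in item_scores.items():
        score = max(scores)  # highest item score within the category
        if category in aids_used and score < 2:
            score = 2  # aids or assistance raise a score of 0 or 1 to 2
        category_scores.append(score)
    # The total score is the mean of the category scores.
    return sum(category_scores) / len(category_scores)


# Hypothetical patient; item counts per category are illustrative.
responses = {
    "Dressing and grooming": [0, 1],
    "Rising": [1, 0],
    "Eating": [0, 0, 2],
    "Walking": [0, 1],
    "Hygiene": [1, 1, 0],
    "Reach": [2, 3],
    "Grip": [0, 0, 0],
    "Activities": [1, 2, 1],
}
total = haq_di_score(responses, aids_used={"Walking", "Grip"})
print(total)  # category scores 1, 1, 2, 2, 1, 3, 2, 2 average to 1.75
```

Passing an empty `aids_used` set gives the alternative scoring without the correction for aids and assistance.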

Statistical analysis

The scores on the eight categories of the HAQ-DI were used as input for the statistical analysis. The item parameters and the means and variances of the latent person parameters were estimated by marginal maximum likelihood, and DIF was examined using Lagrange multiplier (LM) statistics [21]. To compute these statistics, the sample of respondents is divided into subgroups labeled $g = 1, \ldots, G$. For the present application, these are the three disease groups, that is, $G = 3$. The statistic is based on the difference between average observed scores on every item $i$ in the subgroups, that is, $S_{ig} = \frac{1}{N_g} \sum_{n \mid g}^{N_g} X_{ni}$ (where the summation is over the $N_g$ respondents in subgroup $g$), and their expectations $E(S_{ig})$. The differences are squared and divided by their covariance matrix (for the exact expressions, refer to Glas [15]). The hypothesis thus tested is equivalent to testing the hypothesis that the parameters of the items are equal for the subgroups. The LM statistic has an asymptotic chi-square distribution with $G - 1$ degrees of freedom. Below, the statistics will be accompanied by effect sizes $d_i = \max_g |S_{ig} - E(S_{ig})|$, which show the seriousness of the model violation. Since the effect sizes $d_i$ are on a scale ranging from 0 to the maximum score $m_i$, effect sizes $d_i < 0.10$ can be considered indicative of minor, acceptable model violation. It can be noted that this cut-off point is somewhat arbitrary, but its effectiveness can be evaluated from whether enough DIF items are detected and modeled to obtain a fitting overall model.
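The effect size that accompanies the LM statistic can be illustrated with a short Python sketch. It assumes the effect size is the largest absolute gap between observed and model-expected mean item scores over the disease groups; the function name `dif_effect_size` and all numbers are hypothetical. The expected scores would normally come from the fitted IRT model and are simply supplied as inputs here, and the LM statistic itself is not computed.

```python
def dif_effect_size(observed_by_group, expected_by_group):
    """Largest absolute gap between observed and expected mean item score.

    observed_by_group : dict mapping group -> list of item scores X_ni
    expected_by_group : dict mapping group -> model-expected mean score E(S_ig)
    """
    gaps = []
    for group, responses in observed_by_group.items():
        s_ig = sum(responses) / len(responses)  # observed mean S_ig
        gaps.append(abs(s_ig - expected_by_group[group]))
    return max(gaps)


# Hypothetical responses to one item and hypothetical model expectations.
observed = {"RA": [0, 1, 2, 1], "OA": [1, 1, 2, 2], "Gout": [0, 0, 1, 0]}
expected = {"RA": 1.05, "OA": 1.30, "Gout": 0.40}

d = dif_effect_size(observed, expected)
print(f"effect size: {d:.2f}, exceeds the 0.10 cut-off: {d >= 0.10}")
```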

Table 1 Sample characteristics

Characteristics              Total sample   RA sample     OA sample     Gout sample
N                            846            619           125           102
Gender (%)
  Female                     64             69            79            18
  Male                       36             31            21            82
Age (years)
  Mean (SD)                  62 (13.6)      62 (14.2)     63 (11.5)     62 (12.6)
Disease duration (years)
  Mean (SD)                  13 (12.8)      13 (12.4)     14 (13.8)     10 (13.3)
HAQ-DI (range 0–3)
  Mean (SD)                  0.82 (0.7)     0.97 (0.7)    1.00 (0.65)   0.54 (0.67)
SF-36 PCS (range 0–100)
  Mean (SD)                  40 (8.4)       39 (8.1)      38 (8.3)      43 (9.2)
SF-36 MCS (range 0–100)
  Mean (SD)                  39 (7.0)       40 (6.8)      39 (7.3)      38 (7.0)

RA Rheumatoid arthritis, OA Osteoarthritis, HAQ-DI Health Assessment Questionnaire Disability Index, SF-36 Medical Outcomes Study 36-Item Short Form (version 2), PCS Physical component summary, MCS Mental component summary

When items with DIF are identified, the next step is trying to model the DIF in such a way that the measures obtained in the subgroups are still comparable. To this end, DIF can be modeled by assigning these items disease-specific parameters within a generalized partial credit model that still pertains to all respondents. So it is assumed that the same construct is measured in all subgroups, but for some subgroups the item locations on the latent scale are different. In this study, this was done in an iterative procedure in which the item with the largest significant LM test was given disease-specific item parameters (for more information on this procedure, see Glas and Verhelst [16]). These iteration steps were repeated until no items were left with significant LM tests (P < 0.01) or when the effect size was below the set cut-off point ($d_i$ < 0.10). The results of these iteration steps are presented here as results of an analysis consisting of two steps to enhance clarity.
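The control flow of this iterative procedure can be sketched as follows. This is a schematic Python sketch under stated assumptions, not the authors' software: `fit_and_test` is a hypothetical callback standing in for re-estimating the model and computing the LM tests, and an item is treated as a candidate only when its test is significant and its effect size reaches the cut-off.

```python
def model_dif_iteratively(fit_and_test, alpha_level=0.01, min_effect=0.10):
    """Free item parameters one item at a time until no item shows DIF.

    fit_and_test : hypothetical callback; takes a tuple of items that already
                   have disease-specific parameters, refits the model, and
                   returns {item: (lm_statistic, p_value, effect_size)}
    """
    disease_specific = []
    while True:
        results = fit_and_test(tuple(disease_specific))
        # Keep items that are both significant and above the effect-size cut-off.
        flagged = {
            item: stats
            for item, stats in results.items()
            if item not in disease_specific
            and stats[1] < alpha_level
            and stats[2] >= min_effect
        }
        if not flagged:
            return disease_specific
        # The item with the largest LM statistic gets disease-specific parameters.
        worst = max(flagged, key=lambda item: flagged[item][0])
        disease_specific.append(worst)
```

Refitting after every single change matters because DIF in one item can distort the parameter estimates, and hence the test results, of the remaining items.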

The final step in the statistical analyses was to assert that the resulting model was valid in all disease groups, that is, to assert that the same latent scale with disease-specific item parameters for some of the items was applicable in all disease groups. This was done again by computing LM statistics: one targeted at the form of the item response curves and one targeted at the assumption of local independence. The latter assumption implies that item responses are independent given a person's value on the latent variable. If this were not the case, other, unaccounted-for variables would influence response behavior and unidimensionality would be violated.

Results

The results of the DIF analysis before modeling for presence of DIF are given in Table 2. Three items showed DIF according to the criteria defined earlier: Dressing and grooming, Reach, and Activities. The LM statistics of those items were significant, and their effect sizes were larger than 0.10. The item Dressing and grooming was given disease-specific item parameters first. In a second analysis, the Activities item showed DIF. The process was repeated until in the third analysis four items were given disease-specific item parameters. The resulting item parameters are shown in Table 3. It is important to note that the significant items in Table 2 (Walking, Dressing and grooming, Reach, Activities) are not completely analogous to the items in Table 3 (Walking, Dressing and grooming, Eating, Activities). The reason is that the presence of DIF items biases the item parameter estimates of all items, both the items with and without DIF. This motivates the iterative nature of the procedure where items are processed one at a time.

Table 2 Outcomes of tests for DIF

HAQ-DI categories        LM      P      Abs. DIF
Rising                   3.33    0.19   0.03
Walking                  25.02   0.00   0.09
Dressing and grooming    71.20   0.00   0.17
Reach                    37.63   0.00   0.15
Eating                   6.68    0.04   0.06
Grip                     2.66    0.26   0.03
Activities               51.52   0.00   0.15
Hygiene                  7.29    0.03   0.06

DIF Differential item functioning, HAQ-DI Health Assessment Questionnaire Disability Index, LM Outcome Lagrange multiplier test, Abs. DIF Amount of absolute DIF $d_i$
Degrees of freedom = 2

Table 3 Item parameters after modeling DIF

HAQ-DI categories        $\alpha_i$   $\delta_{i1}$   $\delta_{i2}$   $\delta_{i3}$
Rising                   3.758        -0.089          3.777           4.660
Walking
  RA                     3.253        0.429           3.568           6.534
  OA                     3.691        -0.515          3.878           6.875
  Gout                   3.987        0.196           3.844           9.377
Dressing and grooming
  RA                     3.285        -0.428          2.124           3.905
  OA                     2.532        0.101           2.885           3.666
  Gout                   3.850        3.066           4.832           4.994
Reach                    2.671        -0.004          2.306           3.499
Eating
  RA                     2.629        -0.883          1.982           1.861
  OA                     3.077        -0.299          2.828           2.585
  Gout                   2.464        -0.079          2.786           1.673
Grip                     3.824        -0.149          3.176           4.455
Activities
  RA                     2.915        -1.234          2.159           3.205
  OA                     2.431        -0.388          2.070           4.226
  Gout                   3.124        1.713           3.829           4.172
Hygiene                  3.768        -1.557          2.089           3.242

DIF Differential item functioning, HAQ-DI Health Assessment Questionnaire Disability Index, $\alpha_i$ Discrimination parameter, $\delta_{i1}$, $\delta_{i2}$, and $\delta_{i3}$ Category intersection parameters, RA Rheumatoid arthritis, OA Osteoarthritis
Log likelihood = -5941.921

To clarify the interpretation of the results in the table, the item probabilities for OA and gout are illustrated in Fig. 1. Consider the item Dressing and grooming. The discrimination indices (under the heading $\alpha_i$ in Table 3) show that the item has the highest loading on the latent dimension for the patients with gout and the lowest for the patients with OA. Further, in Table 3 and Fig. 1, it can be seen that the category intersection parameters $\delta_{ij}$ are higher for the patients with gout than for the patients with OA. This means that the expected score on Dressing and grooming given a certain disability level is higher for the patients with OA than for the patients with gout. That is, patients with OA endorse this item more and the item Dressing and grooming is more difficult for them than for the patients with gout.
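The comparison of expected scores can be made concrete with a small Python sketch. It assumes the generalized partial credit model with the term $\alpha_i \theta_n - \delta_{ij}$ and uses the Dressing and grooming parameters listed for OA and gout in Table 3; the function name `expected_score` and the disability level $\theta = 0.5$ are arbitrary choices for illustration.

```python
import math


def expected_score(theta, alpha, deltas):
    """Model-expected item score (0..m) at disability level theta."""
    # Running sums of (alpha * theta - delta_j); score 0 has an empty sum.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + alpha * theta - delta)
    peak = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in exponents]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


# Dressing and grooming: (alpha, [delta_1, delta_2, delta_3]) from Table 3.
DRESSING = {
    "OA": (2.532, [0.101, 2.885, 3.666]),
    "Gout": (3.850, [3.066, 4.832, 4.994]),
}

theta = 0.5  # arbitrary disability level on the latent scale
scores = {group: expected_score(theta, a, d) for group, (a, d) in DRESSING.items()}
for group, value in scores.items():
    print(f"{group}: expected score {value:.2f}")
```

At this disability level, the expected score is higher for OA than for gout, in line with the interpretation given above.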

The next question addressed is whether the scale presented in Table 3 actually fits the data. This was investigated using two LM statistics [21], one targeted at the form of the item response curves and one targeted at the assumption of local independence. The first statistic is defined analogous to the statistic for DIF, only this time the subgroups are total-score level groups within the disease groups. The observed total score is the sum score of the responses on all items except the item targeted. Glas [21] demonstrated that this statistic pertains to the hypothesis that the response probabilities as a function of the latent disability parameters are as predicted by the model. Within the three disease groups, three total-score level groups were formed in such a way that the numbers of respondents in each group were approximately the same. The ranges of the scores in the total-score level groups are given at the bottom of the table. The results for the patients with RA are shown in Table 4. The results for the two other diseases were analogous. Note that none of the outcomes of the LM tests were below the significance level of 1%. The last column gives the effect sizes $d$. The highest effect size was 0.05, which was well below the set criterion of 0.10. The overall conclusion is that the model fitted very well, and the hypothesis that the same latent scale pertained to the three diseases was not rejected.

The second test pertained to local independence. The test is also sensitive to violations of unidimensionality. The test targets the dependence between responses on pairs of items. In the present case, responses to consecutive items were evaluated, but this choice is not essential. The test statistic is based on the evaluation of the average scores on some item given the scores on some other item. With this alternative definition of score groups, the test statistic is defined analogous to other LM statistics. The results for the patients with RA are displayed in Table 5. As with the tests for DIF and the form of the item response curves, the results for the two other diseases were analogous and none of the outcomes of the LM tests were below the significance level of 1%. The columns labeled '0' to '3' give the observed and expected average scores on some item $i$ given that the score on item $i - 1$ was '0' to '3', respectively. For example, the average score on item $i = 2$ (Walking) for patients scoring '0' on item $i - 1$ (Rising) was 0.18. The associated expected score was 0.23. The last column gives the effect sizes $d$. The highest effect size was 0.10, which just attained the criterion of 0.10. So again, the predictions by the model were quite acceptable.

[Figure 1: the eight HAQ-DI categories (Rising, Walking, Dressing and grooming, Reach, Eating, Grip, Activities, Hygiene) positioned along an axis running from "More easy to perform" to "More difficult to perform", in separate columns for osteoarthritis and gout]

Fig. 1 An illustration of the item difficulty locations (average of the 3 category intersection parameters) of the Health Assessment Questionnaire Disability Index on the IRT latent scale in patients with osteoarthritis and gout

Table 4 Outcomes of tests for model fit in score level groups for patients with RA

                                         Total-score level groups
HAQ-DI categories        LM     P      Level 1        Level 2        Level 3        d
                                       Obs.   Exp.    Obs.   Exp.    Obs.   Exp.
Rising                   2.77   0.25   0.21   0.22    0.81   0.77    1.56   1.51    0.03
Walking                  7.26   0.03   0.13   0.16    0.68   0.62    1.20   1.22    0.04
Dressing and grooming    0.22   0.89   0.32   0.32    1.03   1.02    1.84   1.84    0.01
Reach                    7.05   0.03   0.25   0.27    0.76   0.83    1.55   1.58    0.04
Eating                   1.53   0.47   0.47   0.47    1.22   1.19    2.12   2.12    0.01
Grip                     4.63   0.10   0.20   0.23    0.88   0.83    1.68   1.67    0.03
Activities               4.70   0.10   0.53   0.51    1.05   1.12    1.90   1.86    0.05
Hygiene                  0.70   0.71   0.50   0.50    1.25   1.27    2.21   2.19    0.01

RA Rheumatoid arthritis, HAQ-DI Health Assessment Questionnaire Disability Index, LM outcome Lagrange multiplier test, Obs. Observed scores, Exp. Expected scores by the model, The observed total score is the sum score of the responses on all items, d effect size
Level 1: total scores 0–4, Level 2: total scores 5–8, Level 3: total scores 9–23

The last question addressed concerns the impact of the DIF for inferences concerning differences between the three diseases on the latent scale. As mentioned previously, the item parameters and the means and variances of the latent person parameters were estimated by marginal maximum likelihood. The obtained mean values of disability for each disease are presented in Fig. 2, together with 99% confidence intervals. The mean for the patients with gout was set equal to zero to identify the latent scale. Note that average disability of the respondents for each disease decreases after the introduction of disease-specific item parameters. Patients with OA had the highest average disability level in all analyses. Patients with gout had the lowest disability. From the confidence intervals, it can be inferred that conclusions from statistical tests would not change. However, after modeling DIF, absolute score differences clearly decreased.

Table 5 Outcomes of tests for local independence for patients with RA: score level given the score level on the previous item

                           Score on previous category
Category   LM     P      0              1              2              3              d
                         Obs.   Exp.    Obs.   Exp.    Obs.   Exp.    Obs.   Exp.
2          8.03   0.05   0.18   0.23    0.80   0.76    1.28   1.26    1.61   1.70    0.09
3          3.65   0.30   0.53   0.51    1.28   1.30    1.91   1.97    2.83   2.79    0.06
4          2.25   0.52   0.37   0.34    0.81   0.83    1.35   1.40    1.99   2.04    0.05
5          2.50   0.48   0.61   0.64    1.38   1.34    1.95   1.95    2.57   2.49    0.06
6          7.69   0.05   0.22   0.26    0.86   0.80    1.27   1.25    1.78   1.84    0.06
7          0.34   0.95   0.57   0.57    1.18   1.19    1.82   1.82    2.40   2.30    0.10
8          1.01   0.80   0.50   0.47    1.17   1.18    1.84   1.85    2.45   2.46    0.03

RA Rheumatoid arthritis, The categories are numbered in the order as they appear in Table 4, LM outcome Lagrange multiplier test, Obs. Observed scores, Exp. Expected scores by the model, The observed total score is the sum score of the responses on all items, d effect size

Fig. 2 Means of IRT disability estimates (y-axis) in rheumatoid arthritis (left panel) and osteoarthritis (right panel) in three analyses (x-axis) labeled 0, 1, and 2. The mean for gout was set equal to zero to identify the latent scale. Analysis 0 was the initial analysis. In analyses 1 and 2, 2 and 4 items with disease-specific item parameters were introduced, respectively

Discussion

An item response theory (IRT)-based method is presented that can be used to make HAQ-DI disability scores better comparable across different rheumatic diseases, and the results of the application of this method suggest that the HAQ-DI can function as a generic instrument.

By now, there is extensive literature on the evaluation of construct validity using factor analyses and IRT analyses [22]. It is important to note that these two classes of models are closely related. In fact, Takane and de Leeuw have shown that under quite general assumptions, these two models are equivalent [23]. Only the traditions of statistical inference are different: factor analysis is usually based on a covariance matrix, while IRT analysis is based on the complete response patterns. This motivates the term "full-information factor analysis" used for multidimensional IRT by Bock, Gibbons, and Muraki [24]. Both approaches have their advantages and disadvantages. One of the advantages of the IRT approach is that it uses more information in the data and, therefore, assumptions such as the form of the item response curves and local independence can be investigated. However, the results obtained using both approaches are closely related. In that sense, the correlated residuals reported by Cole [25, 26] can be interpreted as an indication for lack of local independence and multidimensionality, which can be further investigated in detail using IRT-based techniques. Although both factor analysis and IRT can be used to assess the construct validity of the HAQ-DI, it is important to note that construct validity is not so much a property of an instrument, but a property of inferences made using the instrument [27]. In the present study, it was shown that when a number of disease-specific item parameters are used and the HAQ-DI is scored using $\theta$-estimates, these $\theta$-estimates relate to the same unidimensional scale. Therefore, these scores can support the construct validity of the HAQ-DI for inferences across diseases.

IRT methods offer a sophisticated and robust means to test the generic nature of an instrument by examining whether the underlying latent scale is the same for different groups of individuals. This can be evaluated by examining whether the questionnaire contains items with differential item functioning (DIF), i.e., items where the probability of scoring in the various response categories differs between subgroups of patients after controlling for the general disability level as estimated by the IRT model. Although IRT-based approaches to DIF detection have been increasingly used in health outcomes assessment, research addressing the measurement equivalence of disability scales across different (rheumatic) diseases is still scarce. Only one recent study was found that examined DIF for the HAQ-DI and the 10-item physical functioning scale (PF) of the SF-36 between patients with RA and psoriatic arthritis (PsA) using Rasch analysis [28]. This study found evidence of marked DIF for three HAQ-DI items, similar to our study, and relatively minor DIF for the SF-36 PF scale. The authors concluded that the SF-36 PF scale is a better instrument than the HAQ-DI for comparing disability from PsA with disability from other diseases. However, the study did not evaluate the impact of DIF on individual items for inferences about total HAQ-DI score differences between the diseases or provide guidelines on how to deal with this DIF. Therefore, the objective of this study was twofold: first to investigate whether the HAQ-DI functions as a generic measure of disability across different rheumatic diseases by evaluating DIF and second, if not, to illustrate the use of IRT methods to model DIF so that disability scores can be made comparable across diseases. For this purpose, we used data from three common rheumatic diseases with known differences in disease characteristics: rheumatoid arthritis (RA), osteoarthritis (OA), and gout.

As would be expected, the majority of the patients with RA and OA were women, whereas patients with gout were predominantly men. Mean SF-36 physical and mental component scores were well below the average of 50 in the general population, suggesting that all three diseases have a substantial impact on general health status. Whereas disability scores between RA and OA were very similar, mean HAQ-DI scores were clearly lower for patients with gout and in close correspondence to a recently reported mean HAQ-DI score of 0.59 in a cross-sectional gout sample [29].

However, half of the HAQ-DI items displayed substantial DIF between the three diseases, possibly biasing total score differences between the diseases. After modeling these items by assigning them disease-specific parameters, statistical conclusions regarding disability differences across the 3 conditions did not change. Patients with OA and RA still displayed higher disability scores than patients with gout. However, absolute differences between the diseases were attenuated. HAQ-DI scores based on disease-specific item parameters fitted the data very well and resulted in an underlying latent scale that applied to all three diseases.

An important concern, however, is that only four items served as anchors across the three diseases, and these items appear to be on the “more difficult” end of the scale. To minimize the standard errors of differences between disability estimates in the different disease groups, anchoring should preferably be done in all sections of a scale. Often, this cannot be achieved, but it should be kept in mind that the precision of the method deteriorates as the number of anchor items decreases and depends on their position on the scale. The authors do not recommend using the method when the anchor is very small (e.g., fewer than 4 items or less than 50% of the items).
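The rule of thumb above can be written as a small check. This is only a sketch: the function name and the default thresholds simply restate the example values given in the text and are not a validated criterion.

```python
def anchor_is_adequate(n_anchor_items, n_items, min_items=4, min_fraction=0.5):
    """Return True when the anchor meets both rule-of-thumb thresholds.

    The defaults mirror the example in the text: at least 4 anchor items
    and at least 50% of all items. Both are illustrative assumptions.
    """
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    enough_items = n_anchor_items >= min_items
    enough_share = n_anchor_items / n_items >= min_fraction
    return enough_items and enough_share
```

In the present analysis, four of the eight items served as anchors, which only just meets both thresholds.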

It is also important to emphasize here that the present study focused only on cross-sectional samples of patients with OA, RA, and gout as an example for evaluating the generic nature of the HAQ-DI. It is quite possible that these or other items of the HAQ-DI show DIF, possibly to a different extent, between other rheumatic conditions, non-rheumatic conditions, or general population samples. Accordingly, researchers using the HAQ-DI to compare disability between different subgroups are encouraged to examine DIF before comparing total HAQ-DI scores. The present study provides an example of how IRT methods can be used to evaluate DIF and, if necessary, how to model this DIF to obtain more accurate disability estimates.

Furthermore, all analyses presented in this study are based on so-called standard scores of the HAQ-DI, which take into account the use of aids and devices or assistance from another person [1, 3]. Although this scoring method is most frequently used and recommended [30], some clinical investigations have used an alternative scoring without this correction. Secondary analysis using the alternative scoring method in this study showed that the IRT results obtained with and without correction were very similar.
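The difference between the two scoring variants can be sketched as follows. This is a minimal illustration that assumes the HAQ-DI scoring conventions as they are commonly documented [1, 3]: each category score is the highest item score within the category, reported use of an aid, device, or help from another person raises a category score of 0 or 1 to 2 under standard scoring, and at least six of the eight categories must be answered. The function names and the example data are hypothetical.

```python
def category_score(item_scores, uses_aid_or_help, standard=True):
    """Score one HAQ-DI category from its item scores (0 to 3, None if missing)."""
    answered = [s for s in item_scores if s is not None]
    if not answered:
        return None  # category cannot be scored
    score = max(answered)  # highest item score defines the category score
    # Standard scoring: aids, devices, or help raise a score of 0 or 1 to 2.
    if standard and uses_aid_or_help and score < 2:
        score = 2
    return score


def haq_di(categories, standard=True, min_categories=6):
    """Mean of the scored categories, or None if too few categories are scored.

    categories: list of (item_scores, uses_aid_or_help) tuples, one per category.
    """
    scores = [category_score(items, aid, standard) for items, aid in categories]
    valid = [s for s in scores if s is not None]
    if len(valid) < min_categories:
        return None
    return sum(valid) / len(valid)


# Hypothetical patient: eight categories, aids or help reported for two of them.
example = [
    ([0, 1], False),
    ([1, 0], True),
    ([0, 0], False),
    ([2, 1], False),
    ([0, 0, 0], True),
    ([3, 1, 0], False),
    ([1, 1, 0], False),
    ([0, 0, 0], False),
]
standard_score = haq_di(example, standard=True)
alternative_score = haq_di(example, standard=False)
```

For this hypothetical patient, the standard score is higher than the alternative score because two category scores are raised to 2. The two variants coincide whenever no aids, devices, or help are reported.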

In summary, the results of this study showed that 4 out of the 8 disability items displayed substantial DIF across the 3 diseases, indicating that the HAQ-DI may not fully function as a generic instrument for the assessment of disability across different rheumatic diseases unless DIF is modeled and adjustments to the scoring method are made.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

1. Bruce, B., & Fries, J. F. (2003). The Stanford health assessment questionnaire: A review of its history, issues, progress, and documentation. Journal of Rheumatology, 30(1), 167–178.

2. Fries, J. F., Spitz, P., Kraines, R. G., & Holman, H. R. (1980). Measurement of patient outcome in arthritis. Arthritis and Rheumatism, 23(2), 137–145.

3. Bruce, B., & Fries, J. F. (2005). The health assessment questionnaire (HAQ). Clinical and Experimental Rheumatology, 23(5 Suppl 39), S14–S18.

4. Husted, J. A., Gladman, D. D., Farewell, V. T., & Cook, R. J. (2001). Health-related quality of life of patients with psoriatic arthritis: A comparison with patients with rheumatoid arthritis. Arthritis and Rheumatism, 45(2), 151–158.

5. Johnson, S. R., Gladman, D. D., Schentag, C. T., & Lee, P. (2006). Quality of life and functional status in systemic sclerosis compared to other rheumatic diseases. Journal of Rheumatology, 33(6), 1117–1122.

6. Lindqvist, U. R., Alenius, G. M., Husmark, T., Theander, E., Holmstrom, G., & Larsson, P. T. (2008). The Swedish early psoriatic arthritis register–2-year followup: A comparison with early rheumatoid arthritis. Journal of Rheumatology, 35(4), 668–673.

7. Martinez, J. E., Ferraz, M. B., Sato, E. I., & Atra, E. (1995). Fibromyalgia versus rheumatoid arthritis: A longitudinal comparison of the quality of life. Journal of Rheumatology, 22(2), 270–274.

8. Slatkowsky-Christensen, B., Mowinckel, P., Loge, J. H., & Kvien, T. K. (2007). Health-related quality of life in women with symptomatic hand osteoarthritis: A comparison with rheumatoid arthritis patients, healthy controls, and normative data. Arthritis & Rheumatism (Arthritis Care & Research), 57(8), 1404–1409.

9. Sokoll, K. B., & Helliwell, P. S. (2001). Comparison of disability and quality of life in rheumatoid and psoriatic arthritis. Journal of Rheumatology, 28(8), 1842–1846.

10. Camilli, G., & Shepard, L. A. (1994). Methods for identifying biased test items. Thousand Oaks, CA: Sage.

11. Holland, P. W., & Wainer, H. (1994). Differential item functioning. Hillsdale, NJ: Erlbaum.

12. Chang, H. H., & Mazzeo, J. (1994). The unique correspondence of the item response function and item category response functions in polytomously scored item response models. Psychometrika, 59(3), 391–404.

13. Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16(2), 159–176.

14. Gebhardt, E., & Adams, R. J. (2007). The influence of equating methodology on reported trends in PISA. Journal of Applied Measurement, 8(3), 305–322.

15. Glas, C. A. W. (1998). Detection of differential item functioning using Lagrange multiplier tests. Statistica Sinica, 8(3), 647–667.

16. Glas, C. A. W., & Verhelst, N. D. (1995). Testing the Rasch model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models: Foundations, recent developments, and applications (pp. 69–96). New York: Springer.

17. Grisay, A., de Jong, J. H., Gebhardt, E., Berezner, A., & Halleux-Monseur, B. (2007). Translation equivalence across PISA countries. Journal of Applied Measurement, 8(3), 249–266.

18. de Jong, M. G., Steenkamp, J. B. E. M., & Fox, J. P. (2007). Relaxing measurement invariance in cross-national consumer research using a hierarchical IRT model. Journal of Consumer Research, 34(2), 260–278.

19. ten Klooster, P. M., Taal, E., & van de Laar, M. A. (2008). Rasch analysis of the Dutch health assessment questionnaire disability index and the health assessment questionnaire II in patients with rheumatoid arthritis. Arthritis & Rheumatism (Arthritis Care & Research), 59(12), 1721–1728.

20. Ware, J. E., Kosinski, M., & Dewey, J. E. (2000). How to score version 2 of the SF-36 health survey. Lincoln, RI: QualityMetric Incorporated.

21. Glas, C. A. W. (1999). Modification indices for the 2-PL and the nominal response model. Psychometrika, 64(3), 273–294.

22. Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3(1), 4–70.

23. Takane, Y., & de Leeuw, J. (1987). On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 52(3), 393–408.

24. Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12(3), 261–280.

25. Cole, J. C., Khanna, D., Clements, P. J., Seibold, J. R., Tashkin, D. P., Paulus, H. E., et al. (2006). Single-factor scoring validation for the health assessment questionnaire-disability index (HAQ-DI) in patients with systemic sclerosis and comparison with early rheumatoid arthritis patients. Quality of Life Research, 15(8), 1383–1394.

26. Cole, J. C., Motivala, S. J., Khanna, D., Lee, J. Y., Paulus, H. E., & Irwin, M. R. (2005). Validation of single-factor structure and scoring protocol for the health assessment questionnaire-disability index. Arthritis & Rheumatism (Arthritis Care & Research), 53(4), 536–542.

27. Sireci, S. G. (2009). Packing and unpacking sources of validity evidence: History repeats itself again. In R. W. Lissitz (Ed.), The concept of validity: Revisions, new directions, and applications (pp. 19–37). Charlotte, NC: Information Age Publishing.

28. Taylor, W. J., & McPherson, K. M. (2007). Using Rasch analysis to compare the psychometric properties of the short form 36 physical function score and the health assessment questionnaire disability index in patients with psoriatic arthritis and rheumatoid arthritis. Arthritis & Rheumatism (Arthritis Care & Research), 57(5), 723–729.

29. Alvarez-Hernandez, E., Pelaez-Ballestas, I., Vazquez-Mellado, J., Teran-Estrada, L., Bernard-Medina, A. G., Espinoza, J., et al. (2008). Validation of the health assessment questionnaire disability index in patients with gout. Arthritis & Rheumatism (Arthritis Care & Research), 59(5), 665–669.

30. Zandbelt, M. M., Welsing, P. M., van Gestel, A. M., & van Riel, P. L. (2001). Health assessment questionnaire modifications: Is standardisation needed? Annals of the Rheumatic Diseases, 60(9), 841–845.


Quality of life and functional status in systemic sclerosis compared to other rheumatic diseases

Article

Jun 2006

Objective. To assess clinical factors associated with disability and physical health in patients with systemic sclerosis (SSc) compared to psoriatic arthritis (PsA), systemic lupus erythematosus (SLE), and rheumatoid arthritis (RA) and healthy controls. Methods. Eighty-two patients with SSc, 82 with PsA, 74 with SLE, 42 with RA, and 60 controls were recruited from various rheumatology clinics and underwent physical examination, tender point count, Health Assessment Questionnaire Disability Index (HAQ-DI) and Short Form-36 Health Survey (SF-36) assessments. Results. SSc patients were younger and had shorter disease duration than the comparator groups. SSc patients with joint involvement had significantly poorer HAQ-DI scores than patients with PsA (1.43 vs 0.84; p < 0.05), and had higher visual analog scale pain scores than RA patients (1.37 vs 1.01; p < 0.05). The SF-36 Physical Component Summary and HAQ-DI score in SSc patients were adversely affected by joint involvement (p < 0.01, p < 0.001, respectively), >= 11 tender points (p < 0.01 p < 0.001), gastrointestinal (GI) involvement (p < 0.01, p < 0.01), and high skin score (p = 0.02 p < 0.001). Conclusion. Physical health relating to quality of life is adversely affected in patients with SSc. Disability is associated with the presence of >= 11 tender points, a high skin score. and joint and GI involvement. Joint involvement in SSc is more disabling than joint involvement in PsA; and patients with SSc experience more severe pain than patients with RA.

Full-information item factor analysis

Article

Jan 1988
APPL PSYCH MEAS

Describes a method of item factor analysis based on Thurstone's multiple-factor model and implemented by marginal maximum likelihood estimation and the em algorithm. Statistical significance of successive factors added to the model were tested by the likelihood ratio criterion. Provisions for effects of guessing on multiple-choice items, and for omitted and not-reached items, are included. Bayes constraints on the factor loadings were found to be necessary to suppress Heywood cases. Applications to simulated and real data are presented to substantiate the accuracy and practical utility of the method. (PsycINFO Database Record (c) 2000 APA, all rights reserved)(unassigned)

Full-information item factor analysis

Article

Jan 1988
APPL PSYCH MEAS

Modification indices for the 2-PL and the nominal response model

Article

Sep 1999

Cees Glas

In this paper, it is shown that various violations of the 2-PL model and the nominal response model can be evaluated using the Lagrange multiplier test or the equivalent efficient score test. The tests presented here focus on violation of local stochastic independence and insufficient capture of the form of the item characteristic curves. Primarily, the tests are item-oriented diagnostic tools, but taken together, they also serve the purpose of evaluation of global model fit. A useful feature of Lagrange multiplier statistics is that they are evaluated using maximum likelihood estimates of the null-model only, that is, the parameters of alternative models need not be estimated. As numerical examples, an application to real data and some power studies are presented.

Detection of differential item functioning using Lagrange multiplier tests

Article

Jul 1998
STAT SINICA

Cees Glas

It is shown that differential item functioning can be evaluated using the Lagrange multiplier test or Rao’s efficient score test. The test is presented in the framework of a number of Item Response Theory (IRT) models such as the Rasch model, the one-parameter logistic model, the 2-parameter logistic model, the generalized partial credit model and the nominal response model. However, the paradigm for detection of differential item functioning presented here also applies to other IRT models. Two examples are given, one using simulated data and one using real data.

Methods for Identifying Biased Test Items

Book

Jan 1994

This book makes clear to researchers what item-bias methods can (and cannot) do, how they work and how they should be interpreted. Advice is provided on the most useful methods for particular test situations. The authors explain the logic of each method - from item-response theory to nonparametric, categorical methods - in terms of how differential item functioning (DIF) is defined by the method and how well the method can be expected to work. A summary of findings on the behaviour of indices in empirical studies is included. The book concludes with a set of principles for deciding when DIF should be interpreted as evidence of bias.

How to Score Version 2 of the SF-36® Health Survey

Article

Jan 2000

Thank you for your interest in: Ware, JE, Jr., Kosinski, M, Bjorner, JB, et al., User’s manual for the SF-36v2® Health Survey (2nd ed.). Lincoln, RI: QualityMetric Incorporated, 2007. This 309-page manual is available at and for loan from many university research libraries. Because it is the most requested book, the Boston University research library has multiple copies for library loan. Good luck with your research, John E. Ware, Jr., PhD

A Generalized Partial Credit Model: Application of an EM Algorithm

Article

Jun 1992
APPL PSYCH MEAS

Eiji Muraki

The Partial Credit model with a varying slope parameter has been developed, and it is called the Generalized Partial Credit model. The item step parameter of this model is decomposed to a location and a threshold parameter, following Andrich's Rating Scale formulation. The EM algorithm for estimating the model parameters was derived. The performance of this generalized model is compared with a Rasch family of polytomous item response models based on both simulated and real data. Simulated data were generated and then analyzed by the various polytomous item response models. The results obtained demonstrate that the rating formulation of the Generalized Partial Credit model is quite adaptable to the analysis of polytomous item responses. The real data used in this study consisted of NAEP Mathematics data which was made up of both dichotomous and polytomous item types. The Partial Credit model was applied to this data using both constant and varying slope parameters. The Generalized Partial Credit model, which provides for varying slope parameters, yielded better fit to data than the Partial Credit model without such a provision. Index terms: item response model, polytomous item response model, the Partial Credit model, the Rating Scale model, the Nominal Response model, NAEP.
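
As a rough illustration of the model this abstract describes, the sketch below computes category probabilities for a single polytomous item under a generalized partial credit model. It assumes the common parameterization in which the log-odds of adjacent categories are slope × (theta − step); the function name `gpcm_probs` and its arguments are hypothetical and are not taken from Muraki's paper or from any particular software package.

```python
import math


def gpcm_probs(theta, slope, steps):
    """Category probabilities for one item under a generalized partial credit model.

    theta: latent trait value (assumed scalar).
    slope: item slope (discrimination) parameter.
    steps: step (category intersection) parameters b_1..b_m.
    Returns a list of m + 1 probabilities for categories 0..m.
    """
    # Running sums of slope * (theta - b_v); category 0 has an empty sum (0).
    logits = [0.0]
    for b in steps:
        logits.append(logits[-1] + slope * (theta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logits)
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # A four-category item (scores 0-3) with three illustrative step values.
    for p in gpcm_probs(theta=0.5, slope=1.2, steps=[-1.0, 0.0, 1.0]):
        print(round(p, 3))
```

Setting the slope to the same value for every item reduces this to the Partial Credit model, which is the comparison the abstract reports. Under this parameterization, two adjacent categories are equally likely when theta equals the step parameter between them.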

Testing the Rasch Model

Chapter

Jan 1995

It is shown that the problem of evaluating model fit can be solved within the framework of the general multinomial model, and it is shown how tests for this framework can be adapted to the Rasch model. Four types of tests are considered: generalized Pearson tests, likelihood ratio tests, Wald tests, and Lagrange multiplier tests. The statistics presented not only support the purpose of a global overall model test, but also provide information with respect to specific model violations, such as violation of sufficiency of the sum score, strictly monotone increasing and parallel item response functions, unidimensionality, and differential item functioning.

A Review and Synthesis of the Measurement Invariance Literature: Suggestions, Practices, and Recommendations for Organizational Research

Article

Jan 2000

The establishment of measurement invariance across groups is a logical prerequisite to conducting substantive cross-group comparisons (e.g., tests of group mean differences, invariance of structural parameter estimates), but measurement invariance is rarely tested in organizational research. In this article, the authors (a) elaborate the importance of conducting tests of measurement invariance across groups, (b) review recommended practices for conducting tests of measurement invariance, (c) review applications of measurement invariance tests in substantive applications, (d) discuss issues involved in tests of various aspects of measurement invariance, (e) present an empirical example of the analysis of longitudinal measurement invariance, and (f) propose an integrative paradigm for conducting sequences of measurement invariance tests.
