Application of the Health Assessment Questionnaire disability index to various rheumatic diseases

Maaike M. van Groen · Peter M. ten Klooster ·
Erik Taal · Mart A. F. J. van de Laar ·
Cees A. W. Glas
Accepted: 6 June 2010 / Published online: 18 June 2010
© The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract
Purpose To investigate whether the Stanford Health
Assessment Questionnaire Disability Index (HAQ-DI) can
serve as a generic instrument for measuring disability
across different rheumatic diseases and to propose a scoring
method based on item response theory (IRT) modeling
to support this goal.
Methods The HAQ-DI was administered to a cross-sectional
sample of patients with confirmed rheumatoid arthritis
(n = 619), osteoarthritis (n = 125), or gout (n = 102).
The results were analyzed using the generalized
partial credit model as an IRT model.
Results It was found that 4 out of 8 item categories of the
HAQ-DI displayed substantial differential item functioning
(DIF) over the three diseases. Further, it was shown that
this DIF could be modeled using an IRT model with
disease-specific item parameters, which produces measures
that are comparable for the three diseases.
Conclusion Although the HAQ-DI partially functioned
differently in the three disease groups, the measurement
regarding the disability level of the patients can be made
comparable using IRT methods.
Keywords Rheumatoid arthritis · Osteoarthritis ·
Gout · Health-related quality of life · Item response theory ·
Differential item functioning
Abbreviations
DIF Differential item functioning
HAQ-DI Health Assessment Questionnaire Disability
Index
IRT Item response theory
LM Lagrange multiplier
OA Osteoarthritis
PF Physical functioning scale
PsA Psoriatic arthritis
RA Rheumatoid arthritis
SF-36 Medical Outcomes Study 36-Item Short Form
Introduction
Besides the traditional use of physical and biochemical
measures, patient-centered outcomes have become more and
more important as outcome measures of interventions [1].
For example, patient-reported disability has become a
standard outcome in the clinical studies of rheumatic dis-
eases. One of the most widely used self-reported measures of
physical disability is the Stanford Health Assessment
Questionnaire Disability Index (HAQ-DI) [2]. Although
often referred to as a disease-specific measure, it assesses
physical disability in general and does not focus on specific
disease-associated impairments. In fact, according to its
developers, it was originally intended for use in multiple
illnesses so that the impact of different disease processes
could be compared [1,3]. As a result, the scale has been used
across a wide range of general and clinical populations.
M. M. van Groen · P. M. ten Klooster (corresponding author) · E. Taal ·
M. A. F. J. van de Laar · C. A. W. Glas
Institute for Behavioral Research, Faculty of Behavioral
Sciences, University of Twente, PO Box 217, 7500 AE
Enschede, The Netherlands
e-mail: P.M.tenKlooster@utwente.nl
M. A. F. J. van de Laar
Department of Rheumatology, Medisch Spectrum Twente,
PO Box 50.000, 7500 KA Enschede, The Netherlands
Qual Life Res (2010) 19:1255–1263
DOI 10.1007/s11136-010-9690-9
Especially in the field of rheumatology, the HAQ-DI has
become the measure of choice for assessing physical dis-
ability in several specific rheumatic diseases. Although
physical disability is common among all musculoskeletal
conditions, rheumatic diseases can vary widely in their
underlying disease mechanisms, clinical manifestations,
progress and severity, and composition of the populations
generally affected, all of which may influence the measurement
characteristics and resulting disability scores
across diseases. Nonetheless, mean HAQ-DI scores are
frequently used to directly compare the severity of disability
across different rheumatic diseases, with or without
adjustment for known covariates [4–9]. The purpose of
the current study is to investigate whether the HAQ-DI is
indeed a generic instrument and, if this proves problematic,
to model response behavior on disease-specific items of the
instrument in such a way that the measurement results are
comparable over different groups of rheumatic patients.
The construct validity of the HAQ-DI has been previ-
ously established in numerous studies [1], mostly using
classical psychometric techniques such as factor analysis.
Cole et al. (2005, 2006), for instance, show that there is
considerable support for a single-factor structure and for
comparability of scores of patients with systemic sclerosis
and patients with rheumatoid arthritis. However, some of
the results of these analyses, such as the presence of cor-
related residuals, invite further attention. In the present
article, construct validity is investigated using a unidimensional
item response theory (IRT) model. The relation
between IRT modeling and factor-analytic approaches is
revisited in the Discussion section.
In IRT models, observed responses are related to a
unidimensional latent trait, that is, to some underlying
scale. The unidimensional latent scale of the HAQ-DI
pertains to the disability level of the patients. The observed
responses are explained by the persons’ disability param-
eters and by item parameters related to the probability that
a person with a certain disability parameter endorses an
item. One of the common assumptions of IRT is mea-
surement invariance, that is, the latent scale applies to all
respondents from some population and items have the
same measurement characteristics, that is, the same item
parameters, for these respondents. A violation of these two
assumptions is known as differential item functioning
(DIF). An item shows DIF if the probability of responding
in the different categories of the item varies across groups
of patients with the same disability level [10,11]. In other
words, an item is biased if the observed item score, con-
ditional on the latent disability level of the patients, differs
between subgroups [12]. In the current study, the construct
validity of the HAQ-DI is investigated by assessing DIF for
patients with three different types of arthritis.
DIF is often investigated using the generalized partial
credit model as an IRT model [11]. The generalized partial
credit model [13] applies to polytomously scored items,
such as the items of the HAQ-DI. The probability of a score
in category $x$ of item $i$ is given by the item response curve

$$
P(X_{ni} = x \mid \theta_n) =
\frac{\exp\left[\sum_{j=1}^{x} (\alpha_i \theta_n - \delta_{ij})\right]}
     {1 + \sum_{r=1}^{m_i} \exp\left[\sum_{j=1}^{r} (\alpha_i \theta_n - \delta_{ij})\right]},
$$

where $\theta_n$ is the latent disability level of patient $n$. In the
model, $m_i$ denotes the number of item categories. Further,
$\delta_{ij}$ and $\alpha_i$ are item parameters. $\delta_{ij}$ is a category intersection
parameter, that is, it is the point at which the probability of
responding in category $j - 1$ is equal to the probability of
responding in category $j$. Finally, $\alpha_i$ is a discrimination
parameter that indicates the extent to which the item
response is related to the latent scale. This discrimination
parameter is comparable to a factor loading in a factor
analysis model.
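
As an illustration of the item response curve above, the following minimal Python sketch computes the category probabilities for one item. The function name `gpcm_probs` and the chosen disability level are assumptions made for illustration; the item parameters are those printed for Rising in Table 3. Under this parameterization, adjacent categories are equally likely where the discrimination times the disability level equals the category intersection parameter.

```python
import math

def gpcm_probs(theta, alpha, deltas):
    """Return [P(X=0), ..., P(X=m)] for one item under the GPCM."""
    # The exponent for category x is the sum over j <= x of (alpha*theta - delta_j).
    # Category 0 has an empty sum (exponent 0), which is the "1" in the denominator.
    exponents = [0.0]
    for delta in deltas:
        exponents.append(exponents[-1] + alpha * theta - delta)
    weights = [math.exp(e) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# Rising parameters as printed in Table 3; theta = 0.5 is an arbitrary example value.
probs = gpcm_probs(0.5, 3.758, [-0.089, 3.777, 4.660])
print([round(p, 3) for p in probs])
```

The four probabilities sum to one, and raising the disability level shifts probability mass toward the higher categories.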
If DIF is not present, this is unambiguous support for the
construct validity of the instrument. If DIF is present,
however, the type of DIF becomes important. As previ-
ously noted, measurement invariance pertains to the pres-
ence of the same latent variable in all subgroups and
constancy of item parameters over subgroups. If only the
latter assumption is violated by a limited number of items,
comparability can often still be realized and construct
validity may still be defendable. For example, a question
regarding the number of cars in the household may be a
good item for measuring the latent variable Wealth, though
the metric in downtown New York and in Texas may be
quite different. In IRT, such differences can be modeled by
group-specific item parameters. This approach is, of
course, only defendable if it can be explicitly shown that
the responses to the items given in the two groups pertain
to the same latent variable, that is, that it can be shown that
the same IRT model holds for the entire set of response
data. This approach to modeling DIF, which has a considerable
tradition in educational measurement [14–17] and
in consumer research [18], will also be applied in the
present study.
Patients and methods
Respondents for this study were recruited during several
waves of data collection in the period between 2005 and
2008 at the outpatient rheumatology clinic of the Medisch
Spectrum Twente hospital in Enschede, the Netherlands.
During data collection days, consecutive patients visiting
the outpatient clinic were asked to participate. As the study
did not interfere with usual treatment, ethical approval was
not required according to national legislation and local
institutional policy.
In total, 846 patients with physician-confirmed rheu-
matoid arthritis (RA), osteoarthritis (OA), or gout agreed to
participate. Of the included patients, 619 patients were
treated for RA, 125 for OA, and 102 for gout. Table 1 gives
a number of characteristics of the sample. The majority of
the patients were women, but as would be expected, the
gout sample consisted of only 18% women. Mean age was
62 with a standard deviation of 13.6 years. A validated
Dutch version of the HAQ was used [19]. The average
scores on the Medical Outcomes Study 36-Item Short Form
(SF-36) health survey [20] were reasonably comparable
across the three conditions. HAQ-DI scores were similar
for patients with RA and OA, whereas patients with gout
reported substantially less disability.
Scoring the HAQ-DI
The Health Assessment Questionnaire Disability Index
(HAQ-DI) consists of 20 questions regarding the limita-
tions patients experience in performing daily physical
activities [2]. Patients are asked how difficult it is to per-
form an activity on a scale of 0 (without any difficulty) to 3
(unable to do). Patients are also asked whether they need
assistance or aids for the activity.
The questions of the HAQ-DI are ordered into eight
categories of daily living, covering Rising, Walking,
Dressing and grooming, Reach, Eating, Grip, Activities,
and Hygiene. The highest item score within a category is
used as the score for this category, essentially reducing the
HAQ-DI to an 8-item scale. If a respondent indicates the
use of assistance or aids for a category and the highest item
score within the category is 0 or 1, the category score is
raised to the value 2. The scores on the categories are
averaged to construct a single total score.
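
The scoring rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the official scoring program: the function name `haq_di_score`, the input layout, and the example patient are hypothetical, and missing responses are not handled.

```python
def haq_di_score(item_scores, uses_aid):
    """Standard HAQ-DI score from item responses (0-3) grouped by category.

    item_scores: dict mapping category name -> list of item scores
    uses_aid:    dict mapping category name -> True if aids/assistance are used
    """
    category_scores = []
    for category, scores in item_scores.items():
        score = max(scores)                      # highest item score in the category
        if uses_aid.get(category, False) and score < 2:
            score = 2                            # aids or assistance raise 0 or 1 to 2
        category_scores.append(score)
    return sum(category_scores) / len(category_scores)

# Hypothetical patient; the number of items per category is illustrative only.
responses = {
    "Dressing and grooming": [0, 1],
    "Rising": [1, 1],
    "Eating": [0, 0, 0],
    "Walking": [0, 2],
    "Hygiene": [0, 0, 1],
    "Reach": [3, 1],
    "Grip": [0, 0, 0],
    "Activities": [1, 0, 1],
}
aids = {"Dressing and grooming": True, "Grip": True}
print(haq_di_score(responses, aids))  # category scores 2,1,0,2,1,3,2,1 -> 1.5
```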
Statistical analysis
The scores on the eight categories of the HAQ-DI were
used as input for the statistical analysis. The item param-
eters and the means and variances of the latent person
parameters were estimated by marginal maximum likeli-
hood, and DIF was examined using Lagrange multiplier
(LM) statistics [21]. To compute these statistics, the sample
of respondents is divided into subgroups labeled
$g = 1, \ldots, G$. For the present application, these are the three
disease groups, that is, $G = 3$. The statistic is based on the
difference between average observed scores on every item $i$
in the subgroups, that is, $S_{ig} = \frac{1}{N_g} \sum_{n \mid g}^{N_g} X_{ni}$ (where the
summation is over the $N_g$ respondents in subgroup $g$), and
their expectations $E(S_{ig})$. The differences are squared and
divided by their covariance matrix (for the exact expressions,
refer to Glas [15]). The hypothesis thus tested is
equivalent to testing the hypothesis that the parameters of
the items are equal for the subgroups. The LM statistic has
an asymptotic chi-square distribution with $G - 1$ degrees
of freedom. Below, the statistics will be accompanied by
effect sizes $d_{ig} = \max_g |S_{ig} - E(S_{ig})|$, which show the
seriousness of the model violation. Since the effect sizes $d_{ig}$
are on a scale ranging from 0 to the maximum score $m_i$,
effect sizes $d_{ig} < 0.10$ can be considered indicative of
minor, acceptable model violation. It can be noted that this
cut-off point is somewhat arbitrary, but its effectiveness
can be evaluated from whether enough DIF items are
detected and modeled to obtain a fitting overall model.
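
The decision rule implied by this paragraph can be sketched as follows. With three disease groups the LM statistic is referred to a chi-square distribution with 2 degrees of freedom, whose upper-tail probability has the closed form exp(-x/2). The function names and the group means in the example are assumptions for illustration; the LM values and effect sizes passed to `flags_dif` are those reported in Table 2. Computing the LM statistic itself requires the fitted model and is not attempted here.

```python
import math

def lm_p_value_df2(lm):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-lm / 2.0)

def effect_size(observed_means, expected_means):
    """Largest absolute gap between observed and expected mean item score over groups."""
    return max(abs(o - e) for o, e in zip(observed_means, expected_means))

def flags_dif(lm, d, alpha=0.01, cutoff=0.10):
    """An item is flagged when the LM test is significant and the effect size reaches the cut-off."""
    return lm_p_value_df2(lm) < alpha and d >= cutoff

print(round(lm_p_value_df2(3.33), 2))   # Rising in Table 2
print(flags_dif(71.20, 0.17))           # Dressing and grooming
print(flags_dif(25.02, 0.09))           # Walking: significant, but effect size below 0.10
# Hypothetical observed and expected group means for one item (RA, OA, gout):
print(effect_size([1.10, 0.95, 0.60], [1.00, 1.00, 0.75]))
```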
Table 1 Sample characteristics

Characteristics              Total sample   RA sample    OA sample     Gout sample
N                            846            619          125           102
Gender (%)
  Female                     64             69           79            18
  Male                       36             31           21            82
Age (years)
  Mean (SD)                  62 (13.6)      62 (14.2)    63 (11.5)     62 (12.6)
Disease duration (years)
  Mean (SD)                  13 (12.8)      13 (12.4)    14 (13.8)     10 (13.3)
HAQ-DI (range 0–3)
  Mean (SD)                  0.82 (0.7)     0.97 (0.7)   1.00 (0.65)   0.54 (0.67)
SF-36 PCS (range 0–100)
  Mean (SD)                  40 (8.4)       39 (8.1)     38 (8.3)      43 (9.2)
SF-36 MCS (range 0–100)
  Mean (SD)                  39 (7.0)       40 (6.8)     39 (7.3)      38 (7.0)

RA Rheumatoid arthritis, OA Osteoarthritis, HAQ-DI Health
Assessment Questionnaire Disability Index, SF-36 Medical
Outcomes Study 36-Item Short Form (version 2), PCS Physical
component summary, MCS Mental component summary

When items with DIF are identified, the next step is
trying to model the DIF in such a way that the measures
obtained in the subgroups are still comparable. To this end,
DIF can be modeled by assigning these items disease-specific
parameters within a generalized partial credit
model that still pertains to all respondents. So it is assumed
that the same construct is measured in all subgroups, but
for some subgroups the item locations on the latent scale
are different. In this study, this was done in an iterative
procedure in which the item with the largest significant LM
test was given disease-specific item parameters (for more
information on this procedure, see Glas and Verhelst [16]).
These iteration steps were repeated until no items were left
with significant LM tests (P < 0.01) or when the effect size
was below the set cut-off point ($d_{ig}$ < 0.10). The results of
these iteration steps are presented here as results of an
analysis consisting of two steps to enhance clarity.
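
The iterative procedure can be sketched as a simple loop. In this sketch, `run_lm_tests` is a hypothetical user-supplied function (it is not part of the paper or of any particular library) that refits the model with the currently freed items and returns, for each remaining item, a tuple of LM statistic, P value, and effect size. The stub `fake_lm_tests` with made-up numbers only demonstrates the control flow.

```python
def model_dif_iteratively(run_lm_tests, alpha=0.01, cutoff=0.10):
    """Free one item at a time: the flagged item with the largest LM statistic."""
    freed = []  # items given disease-specific parameters, in the order they were freed
    while True:
        results = run_lm_tests(freed)  # {item: (lm, p, d)} after refitting the model
        flagged = {
            item: stats for item, stats in results.items()
            if item not in freed and stats[1] < alpha and stats[2] >= cutoff
        }
        if not flagged:
            return freed
        worst = max(flagged, key=lambda item: flagged[item][0])
        freed.append(worst)

# Stub with invented results for three hypothetical items A, B, C.
def fake_lm_tests(freed):
    steps = {
        0: {"A": (10.0, 0.001, 0.12), "B": (30.0, 0.0001, 0.20), "C": (2.0, 0.40, 0.02)},
        1: {"A": (12.0, 0.001, 0.11), "C": (3.0, 0.20, 0.03)},
        2: {"C": (9.5, 0.005, 0.05)},  # significant, but effect size below the cut-off
    }
    return steps[len(freed)]

print(model_dif_iteratively(fake_lm_tests))  # ['B', 'A']
```

Refitting after every freed item reflects the point that DIF in one item can bias the parameter estimates of the others.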
The final step in the statistical analyses was to assert that
the resulting model was valid in all disease groups, that is,
to assert that the same latent scale with disease-specific
item parameters for some of the items was applicable in all
disease groups. This was done again by computing LM
statistics: one targeted at the form of the item response
curves and one targeted at the assumption of local independence.
The latter assumption implies that item responses
are independent given a person’s value on the latent
variable. If this is not the case, other, unaccounted-for
variables influence response behavior and unidimensionality
is violated.
Results
The results of the DIF analysis before modeling for pres-
ence of DIF are given in Table 2. Three items showed DIF
according to the criteria defined earlier: Dressing and
grooming, Reach, and Activities. The LM statistics of those
items were significant, and their effect sizes were larger
than 0.10. The item Dressing and grooming was given
disease-specific item parameters first. In a second analysis,
the Activities item showed DIF. The process was repeated
until in the third analysis four items were given disease-
specific item parameters. The resulting item parameters are
shown in Table 3. It is important to note that the significant
items in Table 2 (Walking, Dressing and grooming, Reach,
Activities) are not completely analogous to the items in
Table 3 (Walking, Dressing and grooming, Eating, Activities).
The reason is that the presence of DIF items biases
the item parameter estimates of all items, both the items
with and without DIF. This motivates the iterative nature of
the procedure where items are processed one at a time.
Table 2 Outcomes of tests for DIF

HAQ-DI categories          LM       P       Abs. DIF
Rising                     3.33     0.19    0.03
Walking                    25.02    0.00    0.09
Dressing and grooming      71.20    0.00    0.17
Reach                      37.63    0.00    0.15
Eating                     6.68     0.04    0.06
Grip                       2.66     0.26    0.03
Activities                 51.52    0.00    0.15
Hygiene                    7.29     0.03    0.06

DIF Differential item functioning, HAQ-DI Health Assessment
Questionnaire Disability Index, LM Outcome Lagrange multiplier
test, Abs. DIF Amount of absolute DIF $d_{ig}$
Degrees of freedom = 2

Table 3 Item parameters after modeling DIF

HAQ-DI categories          $\alpha_i$   $\delta_{i1}$   $\delta_{i2}$   $\delta_{i3}$
Rising                     3.758    -0.089    3.777    4.660
Walking
  RA                       3.253     0.429    3.568    6.534
  OA                       3.691    -0.515    3.878    6.875
  Gout                     3.987     0.196    3.844    9.377
Dressing and grooming
  RA                       3.285    -0.428    2.124    3.905
  OA                       2.532     0.101    2.885    3.666
  Gout                     3.850     3.066    4.832    4.994
Reach                      2.671    -0.004    2.306    3.499
Eating
  RA                       2.629    -0.883    1.982    1.861
  OA                       3.077    -0.299    2.828    2.585
  Gout                     2.464    -0.079    2.786    1.673
Grip                       3.824    -0.149    3.176    4.455
Activities
  RA                       2.915    -1.234    2.159    3.205
  OA                       2.431    -0.388    2.070    4.226
  Gout                     3.124     1.713    3.829    4.172
Hygiene                    3.768    -1.557    2.089    3.242

DIF Differential item functioning, HAQ-DI Health Assessment
Questionnaire Disability Index, $\alpha_i$ Discrimination parameter,
$\delta_{i1}$, $\delta_{i2}$, and $\delta_{i3}$ Category intersection parameters,
RA Rheumatoid arthritis, OA Osteoarthritis
Log likelihood = -5941.921

To clarify the interpretation of the results in Table 3,
the item probabilities for OA and gout are illustrated in
Fig. 1. Consider the item Dressing and grooming. The
discrimination indices (under the heading $\alpha_i$ in Table 3)
show that the item has the highest loading on the latent
dimension for the patients with gout and the lowest for the
patients with OA. Further, in Table 3 and Fig. 1, it can be
seen that the category intersection parameters $\delta_{ij}$ are higher
for the patients with gout than for the patients with OA.
This means that the expected score on Dressing and
grooming given a certain disability level is higher for the
patients with OA than for the patients with gout. That is,
patients with OA endorse this item more and the item
Dressing and grooming is more difficult for them than for
the patients with gout.
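
The comparison for Dressing and grooming can be checked numerically from the Table 3 parameters. The sketch below computes the model-expected item score under the generalized partial credit model for OA and gout at a few disability levels. The function name and the chosen disability levels are assumptions for illustration, and because the two discrimination parameters differ, the ordering seen at these levels is not guaranteed at every point of the latent scale.

```python
import math

def gpcm_expected_score(theta, alpha, deltas):
    """Expected item score sum_x x * P(X = x) under the GPCM."""
    exponents = [0.0]  # category 0 has exponent 0
    for delta in deltas:
        exponents.append(exponents[-1] + alpha * theta - delta)
    weights = [math.exp(e) for e in exponents]
    return sum(x * w for x, w in enumerate(weights)) / sum(weights)

# Dressing and grooming parameters as printed in Table 3.
OA = (2.532, [0.101, 2.885, 3.666])
GOUT = (3.850, [3.066, 4.832, 4.994])

for theta in (0.0, 0.5, 1.0):  # arbitrary illustrative disability levels
    e_oa = gpcm_expected_score(theta, *OA)
    e_gout = gpcm_expected_score(theta, *GOUT)
    print(f"theta={theta:.1f}  OA={e_oa:.2f}  gout={e_gout:.2f}")
```

At each of these levels the expected score is higher for OA than for gout, in line with the interpretation given in the text.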
The next question addressed is whether the scale presented
in Table 3 actually fits the data. This was investigated
using two LM statistics [21], one targeted at the form
of the item response curves and one targeted at the
assumption of local independence. The first statistic is
defined analogous to the statistic for DIF, only this time the
subgroups are total-score level groups within the disease
groups. The observed total score is the sum score of the
responses on all items except the item targeted. Glas [21]
demonstrated that this statistic pertains to the hypothesis
that the response probabilities as a function of the latent
disability parameters are as predicted by the model. Within
the three disease groups, three total-score level groups were
formed in such a way that the numbers of respondents in
each group were approximately the same. The ranges of the
scores in the total-score level groups are given at the bot-
tom of the table. The results for the patients with RA are
shown in Table 4. The results for the two other diseases
were analogous. Note that none of the P values of the LM
tests were below the significance level of 1%. The last
column gives the effect sizes $d_{ig}$. The highest effect size
was 0.05, which was well below the set criterion of 0.10.
The overall conclusion is that the model fitted very well,
and the hypothesis that the same latent scale pertained to
the three diseases was not rejected.
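
The observed side of this fit check can be sketched as follows: respondents are grouped by their total score on all items except the targeted one, and the mean score on the targeted item is computed per group. The function name, the data layout, and the toy responses are assumptions for illustration; the default cut points mirror the score ranges given under Table 4 (0–4, 5–8, 9 and above). The model-expected means would come from the fitted IRT model and are not computed here.

```python
def observed_means_by_rest_score(responses, item, upper_bounds=(4, 8)):
    """Mean score on `item` within groups defined by the score on all other items.

    responses:    list of per-respondent lists of category scores
    upper_bounds: inclusive upper limits of all groups except the last
    """
    groups = [[] for _ in range(len(upper_bounds) + 1)]
    for row in responses:
        rest = sum(row) - row[item]                    # total score without the targeted item
        level = sum(rest > b for b in upper_bounds)    # number of cut points exceeded
        groups[level].append(row[item])
    # None marks an empty group rather than dividing by zero.
    return [sum(g) / len(g) if g else None for g in groups]

# Toy data with three items and cut points chosen to suit the small score range.
toy = [[0, 0, 0], [1, 0, 1], [2, 3, 3], [3, 3, 3], [1, 2, 2]]
print(observed_means_by_rest_score(toy, item=0, upper_bounds=(1, 4)))  # [0.5, 1.0, 2.5]
```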
The second test pertained to local independence. The
test is also sensitive to violations of unidimensionality. The
test targets the dependence between responses on pairs of
items. In the present case, responses to consecutive items
were evaluated, but this choice is not essential. The test
statistic is based on the evaluation of the average scores on
some item given the scores on some other item. With this
alternative definition of score groups, the test statistic is
defined analogous to other LM statistics. The results for the
patients with RA are displayed in Table 5. As with the tests
for DIF and the form of the item response curves, the
results for the two other diseases were analogous and none
of the outcomes of the LM tests were below the significance
level of 1%. The columns labeled ‘0’ to ‘3’ give the
observed and expected average scores on some item $i$ given
that the score on item $i - 1$ was ‘0’ to ‘3’, respectively. For
example, the average score on the item $i = 2$, i.e., Walking, for
patients scoring ‘0’ on item $i - 1$, i.e., Rising, was 0.18.
The associated expected score was 0.23. The last column
gives the effect sizes $d_{ig}$. The highest effect size was 0.10,
which just attained the criterion of 0.10. So again, the
predictions by the model were quite acceptable.
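
The observed averages used in this test can be sketched as conditional means: the mean score on an item, split by the score on the preceding item. The function name, data layout, and toy responses are assumptions for illustration; the expected averages reported alongside them require the fitted model and are not reproduced here.

```python
def mean_score_given_previous(responses, item):
    """Mean score on `item` for each observed score on the preceding item."""
    by_previous = {}
    for row in responses:
        by_previous.setdefault(row[item - 1], []).append(row[item])
    return {prev: sum(v) / len(v) for prev, v in sorted(by_previous.items())}

# Toy data: each row holds the scores on two consecutive categories.
toy_pairs = [[0, 0], [0, 1], [1, 1], [1, 2], [3, 3]]
print(mean_score_given_previous(toy_pairs, item=1))  # {0: 0.5, 1: 1.5, 3: 3.0}
```

Large gaps between such observed means and their model-based expectations would point to dependence between the two items beyond what the latent scale explains.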
[Figure 1: locations of the eight HAQ-DI categories (Rising, Walking, Dressing and grooming, Reach, Eating, Grip, Activities, Hygiene) on a scale from 1 to 5 running from "More easy to perform" to "More difficult to perform", shown in two panels: Osteoarthritis and Gout]
Fig. 1 An illustration of the item difficulty locations (average of the
3 category intersection parameters) of the Health Assessment
Questionnaire Disability Index on the IRT latent scale in patients
with osteoarthritis and gout
Table 4 Outcomes of tests for model fit in score level groups for
patients with RA

                                         Total-score level groups
                                         Level 1        Level 2        Level 3
HAQ-DI categories        LM     P        Obs.   Exp.    Obs.   Exp.    Obs.   Exp.    d
Rising                   2.77   0.25     0.21   0.22    0.81   0.77    1.56   1.51    0.03
Walking                  7.26   0.03     0.13   0.16    0.68   0.62    1.20   1.22    0.04
Dressing and grooming    0.22   0.89     0.32   0.32    1.03   1.02    1.84   1.84    0.01
Reach                    7.05   0.03     0.25   0.27    0.76   0.83    1.55   1.58    0.04
Eating                   1.53   0.47     0.47   0.47    1.22   1.19    2.12   2.12    0.01
Grip                     4.63   0.10     0.20   0.23    0.88   0.83    1.68   1.67    0.03
Activities               4.70   0.10     0.53   0.51    1.05   1.12    1.90   1.86    0.05
Hygiene                  0.70   0.71     0.50   0.50    1.25   1.27    2.21   2.19    0.01

RA Rheumatoid arthritis, HAQ-DI Health Assessment Questionnaire
Disability Index, LM outcome Lagrange multiplier test, Obs.
Observed scores, Exp. Expected scores by the model, The observed
total score is the sum score of the responses on all items, d effect size
Level 1: total scores 0–4, Level 2: total scores 5–8, Level 3: total
scores 9–23
The last question addressed concerns the impact of the
DIF for inferences concerning differences between the
three diseases on the latent scale. As mentioned previously,
the item parameters and the means and variances of the
latent person parameters were estimated by marginal
maximum likelihood. The obtained mean values of dis-
ability for each disease are presented in Fig. 2, together
with 99% confidence intervals. The mean for the patients
with gout was set equal to zero to identify the latent scale.
Note that average disability of the respondents for each
disease decreases after the introduction of disease-specific
item parameters. Patients with OA had the highest average
disability level in all analyses. Patients with gout had the
lowest disability. From the confidence intervals, it can be
inferred that conclusions from statistical tests would not
change. However, after modeling DIF, absolute score dif-
ferences clearly decreased.
Discussion
An item response theory (IRT)-based method is presented
that can be used to make HAQ-DI disability scores better
comparable across different rheumatic diseases, and the
results of the application of this method suggest that the
HAQ-DI can function as a generic instrument.
Table 5 Outcomes of tests for local independence for patients with RA: score level given the score level on the previous item

                             Score on previous category
                             0              1              2              3
Category   LM     P          Obs.   Exp.    Obs.   Exp.    Obs.   Exp.    Obs.   Exp.    d
2          8.03   0.05       0.18   0.23    0.80   0.76    1.28   1.26    1.61   1.70    0.09
3          3.65   0.30       0.53   0.51    1.28   1.30    1.91   1.97    2.83   2.79    0.06
4          2.25   0.52       0.37   0.34    0.81   0.83    1.35   1.40    1.99   2.04    0.05
5          2.50   0.48       0.61   0.64    1.38   1.34    1.95   1.95    2.57   2.49    0.06
6          7.69   0.05       0.22   0.26    0.86   0.80    1.27   1.25    1.78   1.84    0.06
7          0.34   0.95       0.57   0.57    1.18   1.19    1.82   1.82    2.40   2.30    0.10
8          1.01   0.80       0.50   0.47    1.17   1.18    1.84   1.85    2.45   2.46    0.03

RA Rheumatoid arthritis, The categories are numbered in the order as they appear in Table 4, LM outcome Lagrange multiplier test, Obs.
Observed scores, Exp. Expected scores by the model, The observed total score is the sum score of the responses on all items, d effect size

Fig. 2 Means of IRT disability
estimates (y-axis) in rheumatoid
arthritis (left panel) and
osteoarthritis (right panel) in
three analyses (x-axis) labeled
0, 1, and 2. The mean for gout
was set equal to zero to identify
the latent scale. Analysis 0 was
the initial analysis. In analyses 1
and 2, 2 and 4 items with
disease-specific item parameters
were introduced, respectively

By now, there is extensive literature on the evaluation of
construct validity using factor analyses and IRT analyses
[22]. It is important to note that these two classes of models
are closely related. In fact, Takane and de Leeuw have
shown that under quite general assumptions, these two
models are equivalent [23]. Only the traditions of statistical
inference are different: factor analysis is usually based on a
covariance matrix, while IRT analysis is based on the
complete response patterns. This motivates the term ‘‘full-
information factor analysis’’ used for multidimensional
IRT by Bock, Gibbons, and Muraki [24]. Both approaches
have their advantages and disadvantages. One of the
advantages of the IRT approach is that it uses more
information in the data and, therefore, assumptions such as
the form of the item response curves and local indepen-
dence can be investigated. However, the results obtained
using both approaches are closely related. In that sense, the
correlated residuals reported by Cole [25,26] can be
interpreted as an indication for lack of local independence
and multidimensionality, which can be further investigated
in detail using IRT-based techniques. Although both factor
analysis and IRT can be used to assess the construct
validity of the HAQ-DI, it is important to note that con-
struct validity is not so much a property of an instrument,
but a property of inferences made using the instrument
[27]. In the present study, it was shown that when a number
of disease-specific item parameters are used and the HAQ-DI
is scored using $\theta$-estimates, these $\theta$-estimates relate to
the same unidimensional scale. Therefore, these scores can
support the construct validity of the HAQ-DI for inferences
across diseases.
IRT methods offer a sophisticated and robust means to
test the generic nature of an instrument by examining
whether the underlying latent scale is the same for different
groups of individuals. This can be evaluated by examining
whether the questionnaire contains items with differential
item functioning (DIF), i.e., items where the probability of
scoring in the various response categories differs between
subgroups of patients after controlling for the general dis-
ability level as estimated by the IRT model. Although IRT-
based approaches to DIF detection have been increasingly
used in health outcomes assessment, research addressing
the measurement equivalence of disability scales across
different (rheumatic) diseases is still scarce. Only one
recent study was found that examined DIF for the HAQ-DI
and the 10-item physical functioning scale (PF) of the
SF-36 between patients with RA and psoriatic arthritis
(PsA) using Rasch analysis [28]. This study found evidence
of marked DIF for three HAQ-DI items, similar to our
study, and relatively minor DIF for the SF-36 PF scale. The
authors concluded that the SF-36 PF scale is a better
instrument than the HAQ-DI for comparing disability from
PsA with disability from other diseases. However, the
study did not evaluate the impact of DIF on individual
items for inferences about total HAQ-DI score differences
between the diseases or provide guidelines on how to deal
with this DIF. Therefore, the objective of this study was
twofold: first to investigate whether the HAQ-DI functions
as a generic measure of disability across different rheu-
matic diseases by evaluating DIF and second, if not, to illus-
trate the use of IRT methods to model DIF so that disability
scores can be made comparable across diseases. For this
purpose, we used data from three common rheumatic dis-
eases with known differences in disease characteristics:
rheumatoid arthritis (RA), osteoarthritis (OA), and gout.
As would be expected, the majority of the patients with
RA and OA were women, whereas patients with gout
were predominantly men. Mean SF-36 physical
and mental component scores were well below the average
of 50 in the general population, suggesting that all three
diseases have a substantial impact on general health status.
Whereas disability scores between RA and OA were very
similar, mean HAQ-DI scores were clearly lower for
patients with gout and in close correspondence to a recently
reported mean HAQ-DI score of 0.59 in a cross-sectional
gout sample [29].
However, half of the HAQ-DI items displayed sub-
stantial DIF between the three diseases, possibly biasing
total score differences between the diseases. After modeling
these items by assigning them disease-specific parameters,
statistical conclusions regarding disability differences
across the 3 conditions did not change. Patients with OA
and RA still displayed higher disability scores than patients
with gout. However, absolute differences between the dis-
eases were attenuated. HAQ-DI scores based on disease-
specific item parameters fitted the data very well and
resulted in an underlying latent scale that applied to all three
diseases.
An important concern, however, is that only four items
served as anchors across the three diseases, and these items
appear to be on the ‘‘more difficult’’ end of the scale. To
minimize the standard errors of differences between dis-
ability estimates in the different disease groups, anchoring
should be preferably done in all sections of a scale. Often,
this cannot be achieved, but it should be kept in mind that
the precision of the method deteriorates as the number of
anchor items decreases and also depends on their position
on the scale. The authors do
not recommend using the method when the anchor is very
small (e.g., fewer than 4 items or less than 50% of the items).
It is also important to emphasize here that the present
study focused only on cross-sectional samples of patients
with OA, RA, and gout as an example for evaluating the
generic nature of the HAQ-DI. It is very well possible that
these or other items of the HAQ-DI may show DIF, pos-
sibly to a different extent, between other rheumatic con-
ditions, non-rheumatic conditions, or general population
samples. Accordingly, researchers using the HAQ-DI to
compare disability between different subgroups are
encouraged to examine DIF before comparing total HAQ-
DI scores. The present study provides an example of how
IRT methods can be used to evaluate DIF and, if necessary,
how to model this DIF to obtain more accurate disability
estimates.
Furthermore, all analyses presented in this study are
based on so-called standard scores of the HAQ-DI, which
take into account the use of aids and devices or assistance
from another person [1,3]. Although this scoring method is
most frequently used and recommended [30], some clinical
investigations have used an alternative scoring without this
correction. Secondary analysis using the alternative scoring
method in this study showed that the IRT results obtained
with and without correction were very similar.
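The difference between the two scoring methods can be sketched as follows. The rules used here follow the commonly described HAQ-DI scoring conventions [1, 3] and are assumptions relative to this article, which does not spell them out: the highest item score within a category is the category score, the score is raised to 2 when aids, devices, or help are used for that category, and the index is the mean of the category scores when at least 6 of the 8 categories are answered. The function name and the example responses are hypothetical.

```python
def haq_di(category_items, aided_categories=(), use_aids=True, min_categories=6):
    """Compute a HAQ-DI score from item responses (0-3; None = missing).

    category_items: dict mapping category name -> list of item scores.
    aided_categories: categories for which aids/devices or help were reported.
    use_aids=True gives the standard score; use_aids=False gives the
    alternative score without the aids/help correction.
    """
    scores = []
    for category, items in category_items.items():
        answered = [s for s in items if s is not None]
        if not answered:
            continue  # category not answered at all
        score = max(answered)  # highest item score defines the category score
        if use_aids and category in aided_categories and score < 2:
            score = 2  # assumed rule: raise to 2 when aids or help are used
        scores.append(score)
    if len(scores) < min_categories:
        return None  # too few categories answered to compute an index
    return sum(scores) / len(scores)

# Hypothetical responses for one patient (not data from the study).
responses = {
    "dressing": [0, 1], "arising": [0, 0], "eating": [1, 0, 0],
    "walking": [0, 0], "hygiene": [2, 1, 0], "reach": [0, 0],
    "grip": [1, 1, 0], "activities": [0, 0, 0],
}
aided = {"walking", "grip"}

if __name__ == "__main__":
    print(haq_di(responses, aided, use_aids=True))   # standard score
    print(haq_di(responses, aided, use_aids=False))  # alternative score
```

For this hypothetical patient, the standard score is higher than the alternative score because two category scores are raised by the correction for aids.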
In summary, the results of this study showed that 4 out
of the 8 disability items displayed substantial DIF across
the 3 diseases, indicating that the HAQ-DI may not fully
function as a generic instrument for the assessment of
disability across different rheumatic diseases unless DIF is
modeled and adjustments to the scoring method are made.
Open Access This article is distributed under the terms of the
Creative Commons Attribution Noncommercial License which per-
mits any noncommercial use, distribution, and reproduction in any
medium, provided the original author(s) and source are credited.
References
1. Bruce, B., & Fries, J. F. (2003). The Stanford health assessment
questionnaire: A review of its history, issues, progress, and
documentation. Journal of Rheumatology, 30(1), 167–178.
2. Fries, J. F., Spitz, P., Kraines, R. G., & Holman, H. R. (1980).
Measurement of patient outcome in arthritis. Arthritis and
Rheumatism, 23(2), 137–145.
3. Bruce, B., & Fries, J. F. (2005). The health assessment ques-
tionnaire (HAQ). Clinical and Experimental Rheumatology, 23(5
Suppl 39), S14–S18.
4. Husted, J. A., Gladman, D. D., Farewell, V. T., & Cook, R. J.
(2001). Health-related quality of life of patients with psoriatic
arthritis: a comparison with patients with rheumatoid arthritis.
Arthritis and Rheumatism, 45(2), 151–158.
5. Johnson, S. R., Gladman, D. D., Schentag, C. T., & Lee, P. (2006).
Quality of life and functional status in systemic sclerosis com-
pared to other rheumatic diseases. Journal of Rheumatology,
33(6), 1117–1122.
6. Lindqvist, U. R., Alenius, G. M., Husmark, T., Theander, E.,
Holmstrom, G., & Larsson, P. T. (2008). The Swedish early pso-
riatic arthritis register–2-year followup: a comparison with early
rheumatoid arthritis. Journal of Rheumatology, 35(4), 668–673.
7. Martinez, J. E., Ferraz, M. B., Sato, E. I., & Atra, E. (1995).
Fibromyalgia versus rheumatoid arthritis: A longitudinal com-
parison of the quality of life. Journal of Rheumatology, 22(2),
270–274.
8. Slatkowsky-Christensen, B., Mowinckel, P., Loge, J. H., &
Kvien, T. K. (2007). Health-related quality of life in women with
symptomatic hand osteoarthritis: a comparison with rheumatoid
arthritis patients, healthy controls, and normative data. Arthritis
& Rheumatism (Arthritis Care & Research), 57(8), 1404–1409.
9. Sokoll, K. B., & Helliwell, P. S. (2001). Comparison of disability
and quality of life in rheumatoid and psoriatic arthritis. Journal of
Rheumatology, 28(8), 1842–1846.
10. Camilli, G., & Shepard, L. A. (1994). Methods for identifying
biased test items. Thousand Oaks, CA: Sage.
11. Holland, P. W., & Wainer, H. (1994). Differential item func-
tioning. Hillsdale, NJ: Erlbaum.
12. Chang, H. H., & Mazzeo, J. (1994). The unique correspondence
of the item response function and item category response func-
tions in polytomously scored item response models. Psychomet-
rika, 59(3), 391–404.
13. Muraki, E. (1992). A generalized partial credit model: Applica-
tion of an EM algorithm. Applied Psychological Measurement,
16(2), 159–176.
14. Gebhardt, E., & Adams, R. J. (2007). The influence of equating
methodology on reported trends in PISA. Journal of Applied
Measurement, 8(3), 305–322.
15. Glas, C. A. W. (1998). Detection of differential item functioning
using Lagrange multiplier tests. Statistica Sinica, 8(3), 647–667.
16. Glas, C. A. W., & Verhelst, N. D. (1995). Testing the Rasch
model. In G. H. Fischer & I. W. Molenaar (Eds.), Rasch models:
Foundations, recent developments, and applications (pp. 69–96).
New York: Springer.
17. Grisay, A., de Jong, J. H., Gebhardt, E., Berezner, A., & Halleux-
Monseur, B. (2007). Translation equivalence across PISA coun-
tries. Journal of Applied Measurement, 8(3), 249–266.
18. de Jong, M. G., Steenkamp, J. B. E. M., & Fox, J. P. (2007).
Relaxing measurement invariance in cross-national consumer
research using a hierarchical IRT model. Journal of Consumer
Research, 34(2), 260–278.
19. ten Klooster, P. M., Taal, E., & van de Laar, M. A. (2008). Rasch
analysis of the Dutch health assessment questionnaire disability
index and the health assessment questionnaire II in patients with
rheumatoid arthritis. Arthritis & Rheumatism (Arthritis Care &
Research), 59(12), 1721–1728.
20. Ware, J. E., Kosinski, M., & Dewey, J. E. (2000). How to score
version 2 of the SF-36 health survey. Lincoln, RI: Quality Metric
Incorporated.
21. Glas, C. A. W. (1999). Modification indices for the 2-PL and the
nominal response model. Psychometrika, 64(3), 273–294.
22. Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis
of the measurement invariance literature: Suggestions, practices,
and recommendations for organizational research. Organizational
Research Methods, 3(1), 4–70.
23. Takane, Y., & de Leeuw, J. (1987). On the relationship between
item response theory and factor analysis of discretized variables.
Psychometrika, 52(3), 393–408.
24. Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-Information
item factor analysis. Applied Psychological Measurement, 12(3),
261–280.
25. Cole, J. C., Khanna, D., Clements, P. J., Seibold, J. R., Tashkin,
D. P., Paulus, H. E., et al. (2006). Single-factor scoring validation
for the health assessment questionnaire-disability index (HAQ-
DI) in patients with systemic sclerosis and comparison with early
rheumatoid arthritis patients. Quality of Life Research, 15(8),
1383–1394.
26. Cole, J. C., Motivala, S. J., Khanna, D., Lee, J. Y., Paulus, H. E.,
& Irwin, M. R. (2005). Validation of single-factor structure and
scoring protocol for the health assessment questionnaire-disabil-
ity index. Arthritis & Rheumatism (Arthritis Care & Research),
53(4), 536–542.
27. Sireci, S. G. (2009). Packing and unpacking sources of validity
evidence: History repeats itself again. In R. W. Lissitz (Ed.), The
concept of validity: Revisions, new directions, and applications
(pp. 19–37). Charlotte, NC: Information Age Publishing.
28. Taylor, W. J., & McPherson, K. M. (2007). Using Rasch analysis
to compare the psychometric properties of the short form 36
physical function score and the health assessment questionnaire
disability index in patients with psoriatic arthritis and rheumatoid
arthritis. Arthritis & Rheumatism (Arthritis Care & Research),
57(5), 723–729.
29. Alvarez-Hernandez, E., Pelaez-Ballestas, I., Vazquez-Mellado, J.,
Teran-Estrada, L., Bernard-Medina, A. G., Espinoza, J., et al.
(2008). Validation of the health assessment questionnaire dis-
ability index in patients with gout. Arthritis & Rheumatism
(Arthritis Care & Research), 59(5), 665–669.
30. Zandbelt, M. M., Welsing, P. M., van Gestel, A. M., & van Riel,
P. L. (2001). Health assessment questionnaire modifications: is
standardisation needed? Annals of the Rheumatic Diseases, 60(9),
841–845.
Article
Thank you for your interest in: Ware, JE, Jr., Kosinski, M, Bjorner, JB, et al., User’s manual for the SF-36v2® Health Survey (2nd ed.). Lincoln, RI: QualityMetric Incorporated, 2007. This 309-page manual is available at and for loan from many university research libraries. Because it is the most requested book, the Boston University research library has multiple copies for library loan. Good luck with your research, John E. Ware, Jr., PhD
Article
The Partial Credit model with a varying slope parameter has been developed, and it is called the Generalized Partial Credit model. The item step parameter of this model is decomposed into a location and a threshold parameter, following Andrich's Rating Scale formulation. The EM algorithm for estimating the model parameters was derived. The performance of this generalized model is compared with a Rasch family of polytomous item response models based on both simulated and real data. Simulated data were generated and then analyzed by the various polytomous item response models. The results obtained demonstrate that the rating formulation of the Generalized Partial Credit model is quite adaptable to the analysis of polytomous item responses. The real data used in this study consisted of NAEP Mathematics data, which were made up of both dichotomous and polytomous item types. The Partial Credit model was applied to these data using both constant and varying slope parameters. The Generalized Partial Credit model, which provides for varying slope parameters, yielded better fit to the data than the Partial Credit model without such a provision. Index terms: item response model, polytomous item response model, the Partial Credit model, the Rating Scale model, the Nominal Response model, NAEP.
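As an illustration of the model described in this abstract, the category probabilities of a single Generalized Partial Credit item can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `gpcm_probs`, its arguments, and the example parameter values are all hypothetical, and the step parameters are passed directly rather than decomposed into a location and a threshold.

```python
import math


def gpcm_probs(theta, slope, steps):
    """Category probabilities for one Generalized Partial Credit item.

    theta: latent trait value of the respondent (hypothetical input).
    slope: item slope (discrimination) parameter.
    steps: item step parameters, one per transition between adjacent
           categories; an item with m steps has m + 1 response categories.
    """
    # Category 0 has logit 0; category k accumulates slope * (theta - step)
    # over the first k steps.
    logits = [0.0]
    for step in steps:
        logits.append(logits[-1] + slope * (theta - step))

    # Subtract the largest logit before exponentiating for numerical stability.
    largest = max(logits)
    weights = [math.exp(z - largest) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative values only: a four-category item with symmetric steps.
probs = gpcm_probs(theta=0.0, slope=1.2, steps=[-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
```

Fixing `slope` to the same value for every item reduces this sketch to the Partial Credit model, which is the constant-slope case the abstract compares against.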
Chapter
It is shown that the problem of evaluating model fit can be solved within the framework of the general multinomial model, and it is shown how tests for this framework can be adapted to the Rasch model. Four types of tests are considered: generalized Pearson tests, likelihood ratio tests, Wald tests, and Lagrange multiplier tests. The statistics presented not only support the purpose of a global overall model test, but also provide information with respect to specific model violations, such as violation of sufficiency of the sum score, strictly monotone increasing and parallel item response functions, unidimensionality, and differential item functioning.
Article
The establishment of measurement invariance across groups is a logical prerequisite to conducting substantive cross-group comparisons (e.g., tests of group mean differences, invariance of structural parameter estimates), but measurement invariance is rarely tested in organizational research. In this article, the authors (a) elaborate the importance of conducting tests of measurement invariance across groups, (b) review recommended practices for conducting tests of measurement invariance, (c) review applications of measurement invariance tests in substantive applications, (d) discuss issues involved in tests of various aspects of measurement invariance, (e) present an empirical example of the analysis of longitudinal measurement invariance, and (f) propose an integrative paradigm for conducting sequences of measurement invariance tests.