The Animal Verbal Fluency (AVF) and Design Fluency (DF) structured and unstructured test versions were administered to N = 294 healthy native Dutch-speaking children who were aged between 6.56 and 15.85 years. The AVF and DF structured test scores increased linearly as a function of age, whilst the relation between age and the DF unstructured test score was curvilinear (i.e., the improvement in test scores was much more pronounced for younger children than for older children). A higher mean level of parental education was associated with significantly higher AVF and DF structured test scores. Sex was not associated with any of the outcomes. Demographically corrected norms for the AVF and DF tests were established, and an automatic scoring program was provided.
Animal Verbal Fluency and Design Fluency
in school-aged children: Effects of age, sex, and mean
level of parental education, and regression-based
normative data
Keywords: Fluency development; Executive functions; Demographic influences; Continuous norms.
tings (Baron, 2004; Lezak, Howieson, & Loring, 2004;
Van der Elst, Van Boxtel, Van Breukelen, & Jolles,
2006a). Two types of fluency tests are distinguished—that
is, verbal and nonverbal fluency tests. In verbal fluency
tests, people are instructed to generate as many words as
possible that belong to a certain category (e.g., a seman-
tic category such as “animals,” or a phonemic category
such as “words that begin with the letter F”). In nonver-
bal (or design) fluency tests, people are asked to generate
as many novel (i.e., nonrepeating) abstract designs as
possible. Both types of fluency test are mainly used to
assess executive functioning and related abilities, such
as working memory, attention, and inhibition (Baldo,
Shimamura, Delis, Kramer, & Kaplan, 2001; Henry &
Crawford, 2004; Mitrushina, Boone, Razani, & D’Elia,
Address correspondence to Wim Van der Elst, Faculty of Medicine and Life Sciences, Department of Psychiatry
and Neuropsychology, Maastricht University, Dr. Tanslaan 12, 6200 MD, Maastricht, The Netherlands (E-mail:
2005; Ruff, Light, Parker, & Levin, 1997; Sergeant,
Geurts, & Oosterlaan, 2002; Van der Elst et al., 2006a).
In addition to these executive abilities, verbal fluency
tests tap word knowledge, access to semantic memory,
and vocabulary size (Ruff et al., 1997; Sergeant et al.,
2002), whilst design fluency tests also tap visuoconstruc-
tional abilities, motor planning, and visuomotor abilities
(Klenberg, Korkman, & Lahti-Nuuttila, 2001; Strauss,
Sherman, & Spreen, 2006). A diminished verbal fluency
and design fluency test performance has been associ-
ated with a variety of clinical conditions in both chil-
dren (e.g., fetal alcohol syndrome, closed head injury,
and autism; Korkman, Kirk, & Kemp, 1998; Levin,
Song, Ewing-Cobbs, Chapman, & Mendelsohn, 2001;
Schonfeld, Mattson, Lang, Delis, & Ripley, 2001) and
adults (e.g., Alzheimer’s disease, Parkinson’s disease, and
traumatic brain injury; Diesfeldt, Van der Elst, & Jolles,
2009; Flowers, Robertson, & Sheridan, 1995; Raskin &
Rearick, 1996).
The aim of the present study was to establish the nor-
mal range of performance on the Animal Verbal Fluency
(AVF) and the Design Fluency (DF) tests, using a sample
of N=294 healthy native Dutch-speaking children who
were aged between 6.56 and 15.85 years. Normative data are essential
because they provide an empirical frame of reference to
evaluate the test performance of an individual (Capitani,
1997; Mitrushina et al., 2005; Van der Elst, 2006). The
first step in a normative analysis is to identify the demo-
graphic variables that are associated with test perfor-
mance (e.g., age, sex, and parental education), so that
the normative data can be appropriately corrected for the
relevant demographic influences. Previous research has
suggested that better performance on the AVF and DF
tests in children is mainly associated with a higher age
(Anderson, 1998; Ardila, Rosselli, Matute, & Guajardo,
2005; Brocki & Bohlin, 2004; Cohen, Morgan, Vaughn,
Riccio, & Hall, 1999; Hurks et al., 2010; Jones-Gotman,
1990; Jurado & Rosselli, 2007; Klenberg et al., 2001;
Korkman, Kemp, & Kirk, 2001; Matute, Rosselli, Ardila,
& Morales, 2004; Prigatano, Gray, & Lomay, 2008;
but see also Anderson, Anderson, Northam, Jacobs, &
Catroppa, 2001) and a higher level of parental educa-
tion (Ardila et al., 2005; Hurks et al., 2010; Hurks et al.,
2006; Klenberg et al., 2001). The effect of sex on fluency
test performance remains unclear. Ardila and colleagues
(2005) found that sex differences occurred in fluency test
performance, but the direction of the effect depended on
the type of fluency test that was considered (i.e., boys out-
performed girls in verbal fluency tests, but the opposite
pattern was found for design fluency tests). Other stud-
ies did not observe significant sex differences in fluency
test performance (Brocki & Bohlin, 2004; Riva, Nichelli,
& Devoti, 2000) or only minimal effects (e.g., Prigatano
et al., 2008, reported that gender accounted for only 1.5%
of the variance in verbal fluency performance).
Two different methods are often used to estab-
lish normative data—that is, the “traditional” and the
“regression-based” methods. In the traditional approach,
normative statistics (usually the means and SDs) are
calculated for each relevant demographic subgroup sepa-
rately. Based on these normative statistics, an individual’s
observed test score is converted into an easily interpretable
metric (such as a z score or a T score). The traditional
normative approach is straightforward and well
known by most psychologists, but there are also some
drawbacks associated with it (for details, see, e.g., Van
Breukelen & Vlaeyen, 2005; Van der Elst, Van Boxtel,
Van Breukelen, & Jolles, 2006b; Testa, Winicki, Pearlson,
Gordon, & Schretlen, 2009). For example, performance
on many (neuro)psychological tests is influenced by
different demographic variables, such as age, sex, and
educational level (Lezak et al., 2004; Mitrushina et al.,
2005). As a result, the total sample has to be subdivided
into many different subgroups. For example, splitting
a sample into two parental education groups and five
age groups results in a total of 10 (= 5 × 2) subgroups.
The splitting of the data into subgroups yields less
precisely estimated normative statistics (i.e., the discrep-
ancy between the sample statistics and the true popula-
tion parameters increases as the sample size decreases),
and consequently more biased norms are obtained. In
addition, there is a problem with the boundary values
when the traditional normative approach is used. For
example, suppose that the normative data were strati-
fied in age bands that span two years (e.g., 6–8 years,
>8–10 years, etc.). This would imply that children aged
7.99 and 8.01 years (differing only by a few days) would
be evaluated against different normative data, whereas
children aged 6.01 and 7.99 years (differing by almost
two years) would be evaluated against the same norma-
tive data, which is not acceptable (Capitani, 1997). The
boundary problem could be reduced by using narrower
age bands, but then the total sample has to be split into
more subgroups (which increases the bias in the norma-
tive statistics because the subgroup sample sizes decrease;
see above).
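The boundary problem can be illustrated with a minimal sketch. The function name `age_band` and the banding rule (two-year bands starting at 6 years, with a boundary age assigned to the lower band) are assumptions chosen to mirror the example in the text; they are not taken from any published normative table.

```python
import math

def age_band(age_years, width=2.0, start=6.0):
    """Return the (lower, upper] age band used to look up subgroup norms.

    Hypothetical banding that mirrors the example in the text:
    6-8 years, >8-10 years, and so on.
    """
    if age_years < start:
        raise ValueError("age below the youngest band")
    # An age exactly on a boundary (e.g., 8.0) is assigned to the lower band.
    index = max(0, math.ceil((age_years - start) / width) - 1)
    lower = start + index * width
    return (lower, lower + width)

# Two children who differ by a few days are compared against different norms,
# whereas two children who differ by almost two years share the same norms.
print(age_band(7.99))
print(age_band(8.01))
print(age_band(6.01))
```

Narrowing `width` reduces the jump at each boundary, but it also multiplies the number of subgroups, which is the trade-off described above.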
In the regression-based normative method, multi-
ple linear regression models are used to compute the
expected test scores of an individual (rather than the sub-
group means that are used in the traditional approach).
Since both continuous (e.g., age in years instead of age
groups) and categorical demographic variables (e.g., level
of parental education) can be handled in the regression-
based approach, there is no need to split the data into
subgroups nor is there a problem with boundary val-
ues. Regression-based normative conversions are highly
individualized (i.e., they take the unique demographic
characteristics of each tested individual into account;
Crawford & Howell, 1998; Heaton, Miller, Taylor, &
Grant, 2004; Van Breukelen & Vlaeyen, 2005; Van der
Elst et al., 2006a, 2006b; Van der Elst, Van Boxtel, Van
Breukelen, & Jolles, 2007), but this advantage also comes
with a disadvantage. Indeed, regression-based norms are
less user-friendly because the user of the normative data
has to actively compute the expected test scores as based
on the unique demographic characteristics of the tested
individual (i.e., it is not possible to provide an exhaus-
tive normative table for each possible combination of the
demographic variables when at least one demographic
variable is continuous). The latter problem can, however,
be easily avoided by establishing simplified normative
tables or by providing a computer program that automat-
ically executes the normative conversions (as was done in
the present study, see below).
Thus, in the present study we used a regression-
based approach to establish the normative data for
the AVF and DF tests (as based on a sample of
N=294 healthy native Dutch-speaking children). The
influence of age, gender, and parental level of education
on test performance was evaluated and corrected for if necessary.
The data were derived from the COOS (Cognitief
Ontwikkelings Onderzoek bij Schoolgaande kinderen; in
Downloaded by [University of Maastricht] at 00:24 24 November 2011
English: cognitive developmental study in school-aged
children), a large-scale study into “normal” cognitive
development. A total of N=294 children were admin-
istered the AVF and DF tests. None of these children had
repeated or skipped a grade. Medication use and health
status were assessed by means of a parental-report ques-
tionnaire. Children who used medication that is known to
affect cognitive performance (such as Ritalin) or children
who had clinical conditions that are known to affect cog-
nition (such as epilepsy or attention-deficit/hyperactivity
disorder, ADHD) were excluded from the sample. All
children who participated in the study were native Dutch
speakers, and all parents (or caregivers) of the children
gave consent for their child to participate in the study.
Basic demographic data for the eligible sample
(N = 294) are provided in Table 1. The age range of
the eligible children was between 6.56 and 15.85 years.
Age was used as a continuous variable in the norma-
tive procedures (see Introduction), but it was categorized
in bands of 1.5 years in Table 1 for descriptive pur-
poses. The educational level of the children’s parents (or
caregivers) was assessed with a commonly used Dutch
educational 8-point rating scale, which ranges from pri-
mary school to university degree (De Bie, 1987). Mean
level of parental education (MLPE) was dichotomized
into low and high groups (after a median split) for parents
who had MLPE values that were <5 and ≥5 on
the 8-point scale, respectively (with 5 = at most junior
vocational education). These two levels of education cor-
respond with a mean (SD) of about 9.88 (2.59) and
14.68 (3.30) years of full-time education, respectively.
The Ethics Committee of the Faculty of Psychology
and Neuroscience of Maastricht University approved the
study protocol, and all data included in this manuscript
were obtained in compliance with the regulations of this committee.
Procedure and instruments
The fluency tests were administered to each child indi-
vidually. In the Animal Verbal Fluency (AVF) Test, the
following instruction was given: “Next, I’m going to give
you one minute to tell me all the animals you can think
of. They can be any kind of animals that you can think of,
such as birds, fish, etcetera. Any questions? Go as fast as
you can. Ready? Go.” The total number of correct, non-
repeated animal names is scored and is referred to as the
AVF test score.
The Design Fluency (DF) Test is part of the
Developmental NEuroPSYchological Assessment
(NEPSY; Korkman et al., 1998). In the DF, the children
were asked (following standard NEPSY test instructions)
to draw as many novel designs as possible (in 60 s) by
connecting two or more dots that were arranged in a
structured or an unstructured array. The test material
consisted of a large sheet with 70 squares, with five dots
in each square. In the structured array, the five dots were
arranged in the same spatial configuration as the number
five on a die. In the unstructured array, the five dots
were randomly distributed in the square (note that all
squares contained the same random pattern). Scoring
followed the standard NEPSY rules. The number of
correctly generated structured designs and unstructured
designs are referred to as the DF structured and DF
unstructured test scores, respectively.
Statistical analyses
In the exploratory data analysis phase, means and
standard deviations were computed for the AVF, the
DF structured, and the DF unstructured test scores,
and Pearson zero-order correlations between the out-
comes and the independent variables were computed.
The effects of the demographic variables on the AVF,
DF structured, and DF unstructured test scores were
further analyzed with multiple linear regression analyses.
The full regression models included age, age², sex,
MLPE, and all two-way interactions as predictors. Age
was centered (= calendar age – 11 years, the approximate
mean age in the sample) before computing the quadratic age term to
avoid multicollinearity (Aiken & West, 1991). Sex was
dummy coded as 1 = male and 0 = female. MLPE was
dummy coded as 1 = high and 0 = low. The full regression
models were reduced in a step-down hierarchical procedure
by excluding the nonsignificant predictors from the models.
Table 1. Basic demographic data

                                          Gender          Mean level of parental education
Age range (years)    N     Age M (SD)     Male   Female   Low    High
<8                   63    7.13 (0.29)    26     37       35     28
≥8 and <9.5          46    9.02 (0.28)    23     23       24     22
≥9.5 and <11         23    10.34 (0.63)   14     9        10     13
≥11 and <12.5        44    11.37 (0.20)   11     33       21     23
≥12.5 and <14        64    13.37 (0.39)   32     32       25     39
≥14                  54    14.81 (0.52)   30     24       25     29
Total                294   11.08 (2.82)   136    158      140    154

Note. N = 294.

The assumptions of regression analysis were tested
for each model: normal distribution of the residuals (by
conducting Kolmogorov–Smirnov tests on the residual
values), homoscedasticity (by grouping participants into
quartiles of the predicted scores and applying the Levene
test to the residuals), multicollinearity (by calculating the
variance inflation factors, which should not exceed 10;
Belsley, Kuh, & Welsch, 1980), and influential cases (by
calculating Cook’s distances).
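The multicollinearity check, and the reason for centering age before squaring it, can be sketched as follows. The ages below are hypothetical, evenly spaced values rather than the study data, and the helper names `pearson_r` and `vif_two_predictors` are illustrative. For a model with exactly two predictors, the variance inflation factor reduces to 1 / (1 − r²), where r is the correlation between the predictors.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Variance inflation factor for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical ages from 6.5 to 15.5 years in half-year steps (not study data).
ages = [6.5 + 0.5 * i for i in range(19)]
mean_age = sum(ages) / len(ages)

# Without centering, age and age squared are almost perfectly correlated.
raw_vif = vif_two_predictors(ages, [a ** 2 for a in ages])

# After centering, the linear and quadratic terms are nearly uncorrelated.
centered = [a - mean_age for a in ages]
centered_vif = vif_two_predictors(centered, [c ** 2 for c in centered])

print(round(raw_vif, 1), round(centered_vif, 2))
```

In this toy example the uncentered terms far exceed the cut-off of 10, whereas the centered terms have a variance inflation factor of about 1.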
Regression-based normative data were established
by means of a four-step procedure (Van Breukelen
& Vlaeyen, 2005; Van der Elst et al., 2006a, 2006b,
2007). First, the expected fluency test scores were computed
by means of the final multiple regression models
(expected score = $B_0 + B_1 X_1 + \dots + B_n X_n$, with $B_0$ = the intercept,
$B_n$ = the regression weights for the demographic variables,
and $X_n$ = the values of the demographic variables).
Second, the residuals were calculated ($e_i$ = observed score
– expected score). Third, the residuals were standardized:
$Z_i = e_i / SD(\text{residual})$, with $SD(\text{residual})$ = the standard
deviation of the residuals in the normative sample.
If heteroscedasticity occurred, the SD(residual) values
that were used in the standardization of the residuals
were computed per quartile of the predicted scores (Van
Breukelen & Vlaeyen, 2005; Van der Elst et al., 2007).
Fourth, the standardized residuals were converted into
percentile values (via the standard normal cumulative dis-
tribution function if the model assumption of normality
of the standardized residuals was met in the normative
sample, or via the empirical cumulative distribution func-
tion of the standardized residuals if the standardized
residuals were not normally distributed in the normative sample).
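The fourth step can be sketched as a single conversion function. This is a minimal illustration, not the authors' implementation: the function name `percentile_from_z`, the toy residuals, and the choice to define the empirical cumulative distribution as the proportion of values less than or equal to z are all assumptions.

```python
from bisect import bisect_right
from statistics import NormalDist

def percentile_from_z(z, normative_z=None):
    """Convert a standardized residual into a percentile (0-100).

    If normative_z is None, normality is assumed and the standard normal
    cumulative distribution function is used. Otherwise the empirical
    cumulative distribution of the supplied standardized residuals is used.
    """
    if normative_z is None:
        return 100.0 * NormalDist().cdf(z)
    ordered = sorted(normative_z)
    # Proportion of normative standardized residuals that are <= z.
    return 100.0 * bisect_right(ordered, z) / len(ordered)

# Hypothetical standardized residuals, for illustration only.
toy_sample = [-1.8, -0.9, -0.4, -0.1, 0.0, 0.3, 0.7, 1.1, 1.6, 2.4]

print(percentile_from_z(0.0))              # normal-theory percentile
print(percentile_from_z(0.0, toy_sample))  # empirical percentile
```

With a real normative sample, the second argument would hold the standardized residuals of all children in that sample.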
An alpha level of .01 was chosen in all analyses to
avoid Type I errors due to multiple testing (Bonferroni
correction for the number of dependent variables, critical
α = .01). All analyses were conducted with the
“R” software package Version 2.10.1 for Linux.
Exploratory data analyses
The mean (SD) AVF, DF structured, and DF unstruc-
tured test scores equaled 18.76 (6.09), 14.16 (5.72), and
14.32 (4.75), respectively. As shown in Table 2, the intercorrelations
between all fluency test scores were significantly
positive (all r ≥ .45; all p < .01). There was a
significantly positive correlation between age and all outcome
measures (all r ≥ .52) and between MLPE and the
AVF and the DF structured test scores (all r ≥ .20; all
p < .01). Sex was not significantly correlated with any of
the outcome variables (all r ≤ .09; all p > .01).
The effects of demographic variables on fluency
test performance
The final multiple linear regression models are shown
in Table 3. The DF structured test score was square
root transformed because preliminary analyses suggested
that heteroscedasticity occurred with the untransformed
scores (i.e., the error variance increased as a function of
the predicted test scores). After this transformation, the
assumptions of multiple linear regression analysis were
met for all final models (i.e., all values of the Levene
statistics ≤ 3.055, all p > .01; all Kolmogorov–Smirnov
Z-values ≤ 1.496, all p > .01; all Cook’s distance values
≤ 0.037; and all variance inflation factors ≤ 1.01).
The AVF and DF structured test scores increased lin-
early as a function of age (see Table 3). In addition to
the linear age effect, the DF unstructured test score was
affected by a quadratic age effect. Figure 1 graphically
presents the relative effects of age on the different fluency
test scores.¹ Children who had parents with a high
MLPE obtained higher AVF and DF structured test
scores than children who had parents with a low MLPE.
Sex did not affect any of the outcome measures, and none
of the interaction terms reached significance. The final
regression models explained between 29% and 54% of the
variance in the fluency test scores.
¹ Figure 1 presents the expected fluency test scores at a given
age relative to the expected test scores at 6.5 years. Relative rather
than raw expected test scores were presented for reasons of clarity
and comparability (i.e., the different outcome measures were
expressed in different units, raw values versus square root transformed
values, which makes it difficult to directly compare raw
expected test scores). Note also that Figure 1 presents relative
values for children who have parents with a low MLPE, but
the shape of these curves is very similar for children who have
parents with a high MLPE.

Table 2. Zero-order correlations between the fluency test scores and the demographic variables

Variable                                      1      2      3      4      5      6
1: Animal Verbal Fluency test score           1.00
2: Design Fluency structured test score       .45*   1.00
3: Design Fluency unstructured test score     .46*   .80*   1.00
4: Age                                        .52*   .66*   .65*   1.00
5: Gender                                     .09    .07    .03    .08    1.00
6: Mean level of parental education           .22*   .20*   .10    .09    .04    1.00

Note. * p < .01.

Table 3. Final multiple linear regression models for the Animal Verbal Fluency, the Design Fluency structured, and the Design Fluency
unstructured test scores resulting from a step-down hierarchical procedure

Test score                    Variable     B       SE B    Std. B   T        SD (residual)   R²
Animal Verbal Fluency         (constant)   17.60   0.44             40.34
                              Age          1.09    0.11    0.65     10.17
                              MLPE         2.00    0.60    0.13     3.32     5.12            .29
Design Fluency structured     (constant)   3.56    0.05             73.20
                              Age          0.18    0.01    0.50     14.94
                              MLPE         0.21    0.07    0.16     3.09     0.57            .54
Design Fluency unstructured   (constant)   14.95   0.32             46.86
                              Age          1.05    0.07    0.64     14.63
                              Age²         –0.09   0.03    –0.13    –2.89    3.42            .43

Note. Encoding of the predictors: age = calendar age – 11, age² = (calendar age – 11)². MLPE (mean level of parental education): low =
0, high = 1. R² is adjusted for the number of predictors in the regression equations.

Figure 1. Relative expected Animal Verbal Fluency, Design Fluency
structured, and Design Fluency unstructured test scores as a
function of age. Relative expected test scores were presented
rather than raw expected test scores for reasons of clarity and
comparability (i.e., the different raw test scores were expressed
in different units).

Normative procedure
Norms for the AVF, DF structured, and DF unstructured
test scores were established by means of the four-step
procedure described above. For example, suppose that
a 10-year-old child who has parents with a low MLPE
generated 15 animal names in 60 s. The first step in the
normative procedure is to calculate the expected AVF test
score for this child (by means of the regression models
that were presented in Table 3), which equals 16.51 (that
is, 17.60 + [(10 – 11) × 1.09] + (2.00 × 0)). Second,
the residual is calculated, that is, –1.51 (= 15 – 16.51).
Third, the residual is standardized by means of the
SD(residual) value of the AVF model (see Table 3), that
is, –0.29 (= –1.51/5.12). Fourth, the standardized residual
is converted into a percentile value by means of the
standard normal cumulative distribution. A standardized
residual equal to –0.29 corresponds with a percentile
value of 38. Thus, 38% of the “normal” population of
10-year-old children who have parents with a low MLPE
obtain an AVF test score that is lower than 15. The AVF test
performance of this child is thus within normal limits.
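The worked example can be reproduced with a short sketch that applies the four steps to the coefficients reported in Table 3. The dictionary layout and the function name `normative_percentile` are illustrative. Two points are assumptions rather than statements from the source: the quadratic age weight of the DF unstructured model is taken to be negative (–0.09), which is consistent with the flattening age curve described in the text, and the DF structured model is applied to the square root of the observed score because that score was square root transformed before modeling.

```python
import math
from statistics import NormalDist

# Coefficients transcribed from Table 3; age is centered at 11 years.
# Assumption: the age-squared weight is negative (-0.09), see lead-in.
MODELS = {
    "AVF": {"b0": 17.60, "age": 1.09, "age2": 0.0, "mlpe": 2.00,
            "sd": 5.12, "sqrt": False},
    "DF_structured": {"b0": 3.56, "age": 0.18, "age2": 0.0, "mlpe": 0.21,
                      "sd": 0.57, "sqrt": True},
    "DF_unstructured": {"b0": 14.95, "age": 1.05, "age2": -0.09, "mlpe": 0.0,
                        "sd": 3.42, "sqrt": False},
}

def normative_percentile(test, observed, age, high_mlpe):
    """Apply the four-step procedure and return all intermediate values."""
    m = MODELS[test]
    centered_age = age - 11.0
    # Step 1: expected score from the final regression model.
    expected = (m["b0"] + m["age"] * centered_age
                + m["age2"] * centered_age ** 2
                + m["mlpe"] * (1 if high_mlpe else 0))
    # The DF structured model lives on the square root scale.
    score = math.sqrt(observed) if m["sqrt"] else observed
    # Step 2: residual. Step 3: standardized residual.
    residual = score - expected
    z = residual / m["sd"]
    # Step 4: percentile via the standard normal cumulative distribution.
    percentile = 100.0 * NormalDist().cdf(z)
    return expected, residual, z, percentile

expected, residual, z, pct = normative_percentile("AVF", 15, 10.0, high_mlpe=False)
print(round(expected, 2), round(residual, 2), round(z, 2), round(pct))
# prints: 16.51 -1.51 -0.29 38
```

Sex does not appear in the function signature because it was not retained in any of the final models.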
User-friendly normative data
The four-step normative procedure provides highly indi-
vidualized norms but lacks user-friendliness because the
users of the norms have to actively make the required
computations (as described in the previous paragraph).
Therefore, we also provided simplified normative tables
that were derived from the four-step normative procedure
described above (see Tables A1–A3 in the Appendix).
Note that Table A2 provides normative data for the
raw DF structured test score rather than for the square root
transformed DF structured test score, also in view of increasing
the user-friendliness of the norms. The use of the normative
tables is straightforward. For example, Table A1 (in the
Appendix) immediately shows that the AVF test score
of 15 that was obtained by the 10-year-old child from the
example corresponds to a percentile value between 30
and 50.
The normative tables in the Appendix are user-friendly
but lack some accuracy because a child’s age has to be
rounded up if he or she is not exactly 6.5, 7, 7.5,...,
or 15.5 years old, and because only a limited num-
ber of percentile values can be presented in these tables
(due to space limitations). To maximize both the user-
friendliness and the accuracy of the norms, a computer-
based algorithm of the four-step normative procedure
was implemented in a Microsoft Excel worksheet. In this
worksheet, the user of the norms types in the age of the
tested child and the MLPE of the parents of the tested
child together with his or her observed AVF, DF struc-
tured, and DF unstructured test scores. The worksheet
automatically converts the test scores into percentiles by
using the four-step procedure described above and graph-
ically presents the results. This worksheet is available
via the ‘Supplementary’ tab on the article’s online page
The main aim of the present study was to establish
the normal range of performance on the AVF, the DF
structured, and the DF unstructured tests in school-aged
children. As a first step in the normative procedure, the
influence of demographic variables on test performance
was evaluated. In contrast to most previous studies, we
used multiple linear regression models that included both
linear and quadratic age effects as predictors of test per-
formance. An advantage of using such models is that
linear age effects can be distinguished from quadratic
age effects. The results showed that the AVF and the
DF structured test scores increased linearly as a function
of age at a rate of approximately 8.5% and 6.5% per year
(relative to the expected scores at 6.5 years), respectively (see Figure 1). In contrast, the
curvilinear relation between age and the DF unstructured
test score suggested that the relative improvement in test
performance was much more pronounced for younger
children than for older children (see Figure 1). For exam-
ple, the DF unstructured test score of a 7.5-year-old child
was about 21.1% higher than the DF unstructured test
score of a 6.5-year-old child, whilst the DF unstructured
test score of a 15.5-year-old child was only about 3.9%
higher than the test score of a 14.5-year-old child.
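These percentages can be checked against the DF unstructured model in Table 3. The sketch below rests on two assumptions: the quadratic age weight is negative (–0.09), and each yearly gain is expressed relative to the expected score at 6.5 years, as is done in Figure 1. The function names are illustrative.

```python
def expected_df_unstructured(age):
    """Expected DF unstructured score; age is centered at 11 years."""
    centered_age = age - 11.0
    # Assumption: the age-squared weight is negative (-0.09).
    return 14.95 + 1.05 * centered_age - 0.09 * centered_age ** 2

# Reference point used for the relative scores in Figure 1.
baseline = expected_df_unstructured(6.5)

def relative_gain(age_from, age_to):
    """Gain between two ages as a percentage of the expected score at 6.5 years."""
    gain = expected_df_unstructured(age_to) - expected_df_unstructured(age_from)
    return 100.0 * gain / baseline

print(round(relative_gain(6.5, 7.5), 1))    # prints: 21.1
print(round(relative_gain(14.5, 15.5), 1))  # prints: 3.9
```

Under these assumptions the sketch reproduces both reported values, which suggests that the percentages in the text are relative to the 6.5-year baseline rather than to the score one year earlier.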
In addition to age (and age² for the DF unstructured
test score), MLPE profoundly affected the AVF and the
DF structured test scores. For example, children who
had parents with a high MLPE generated on average
two words more than children who had parents with
a low MLPE. A difference of two words is substantial
and corresponded with about two years of development
(i.e., 2.00/1.09 = 1.83 years; see Table 3). For example,
the expected AVF test score of a 10-year-old child
who had parents with a high MLPE was approximately
equal to the expected AVF test score of a 12-year-old
child who had parents with a low MLPE. Previous stud-
ies also found that MLPE is associated with verbal and
design fluency test performance in children (Ardila et al.,
2005; Hurks et al., 2010; Hurks et al., 2006; Klenberg
et al., 2001), but these studies did not make a distinction
between the structured and the unstructured test versions
of the DF—as was done in the present study. We found
that MLPE had a differential effect on DF test perfor-
mance: MLPE was significantly associated with the DF
structured test score, but not with the DF unstructured test
score. This result was unexpected, because it is generally
assumed that the DF structured and the DF unstructured
test versions measure the same underlying construct(s).
However, if this were to be the case, it would be expected
that the scores on both test versions would be influenced
in a similar way by an important demographic variable
such as MLPE—which was not the case.
It is possible that Type I errors accounted for the dif-
ferential effects of MLPE on the DF structured and the
DF unstructured test scores, but this explanation seems
to be unlikely. Indeed, the correlation between MLPE
and the DF unstructured test score was not significantly
different from zero (i.e., r = .10; see Table 2), and the
p-value of the regression weight of MLPE was high when
MLPE was forced to remain in a regression model that
also included age and age² as predictors (p = .43; data
not shown). Another possibility for explaining the differ-
ential effects of MLPE on the DF structured and the DF
unstructured test scores is that both test versions measure
several common constructs (such as executive functions
and visuoconstructional abilities), but also tap constructs
that are unique for each test version. Circumstantial evi-
dence for this claim can be found in the high—but not
extreme—correlation between the DF structured and the
DF unstructured test scores (i.e., r = .80; see Table 2).
For example, it is possible that mental rotation is a con-
struct that is tapped by the DF structured test version,
but not by the DF unstructured test version. Indeed,
the symmetrical organization of the array of dots of the
DF structured test version allows the child to generate
new designs by, for example, mentally rotating existing
designs, but this is not possible in the DF unstructured
test version (because of the nonsymmetrical organization
of its array of dots). Previous research has also shown
that MLPE is associated with the mental rotation abilities
of a child (see, e.g., Noble, McCandliss, & Farah, 2007),
which could thus explain the differential associations of
MLPE with the DF structured and DF unstructured test
scores. It is unknown whether this explanation is correct
(no empirical studies have yet evaluated the association
between mental rotation abilities and performance on
the DF structured and unstructured test versions), but
no matter what explanation accounts for the differen-
tial effects of the demographic variables on fluency test
performance, from a psychometric viewpoint this finding
implies that the normative data should be differentially
corrected for the relevant demographic influences. Thus
the normative data for the AVF and the DF structured
test scores were corrected for linear age effects and for
MLPE, whilst the normative data for the DF unstructured
test score were corrected for linear and quadratic
age effects. In agreement with most previous studies (see
Introduction), sex had no significant influence on the
AVF and DF scores. Normative tables were established,
as well as an automatic scoring program that maximizes
both the user-friendliness and the accuracy of the norms.
There are some limitations of this study that warrant
further discussion. First, the present study involved a
sample of Dutch-speaking children. The question arises
whether normative data that were based on a Dutch-
speaking sample can also be applied to children with
a different native language or a different cultural back-
ground. Research in adults and the elderly has shown
that linguistic factors significantly affect AVF test perfor-
mance. For example, Kempler and colleagues (Kempler,
Teng, Dick, Taussig, & Davis, 1998) found that Spanish
speakers generated the smallest number of animal names
in comparison to Chinese and English speakers, and
Vietnamese speakers generated the most animal names. The
researchers related these differences to differences in the
length of words for animal names in these languages
(with animal names being longest in Spanish and short-
est in Vietnamese). In children, the effects of linguistic
factors on AVF test performance have not yet been eval-
uated, but it is conceivable that similar influences exist.
It is thus not recommended to use the AVF test norms
that were established in the present study to evaluate
the test performance of non-Dutch-speaking children—
at least not until the influence of linguistic factors on
AVF test performance in children has been evaluated in
more detail. Linguistic differences per se are unlikely to
affect the DF test performance, but cross-cultural differ-
ences in test performance may nevertheless arise because
of differences in general factors such as “degree of famil-
iarity with formal testing” (Ardila, 1995). In Western
countries, differences in these factors are minimal and
are unlikely to substantially affect test performance, but
larger differences may exist between children who live
in Western and in non-Western countries. It is therefore
not recommended to use our DF structured and DF
unstructured test norms to evaluate the test performance
of non-Western children.
Second, the regression models (and normative data)
that were established in the present study were based
on the data of a sample of children who were aged
between 6.56 and 15.85 years. The question arises whether
the regression models can also be used to compute the
expected test scores for children whose age falls outside
the age range that was considered during the regression
model building phase. Extrapolating a regression model
requires several assumptions. For example, if we want to
estimate the expected test scores of a 17-year-old child
(as based on the regression models that were presented in
Table 3), we have to assume that the relation between age
and fluency test performance in the age range between
15.85 and 17 years is identical (or at least very similar)
to the relation between age and fluency test performance
in the age range between 6.56 and 15.85 years. If this
assumption is not valid, the resulting normative data may
be severely biased (e.g., a child with a “normal” test
performance may be classified as being “impaired,” or
vice versa). Thus, extrapolation of the regression models
beyond the age range that was considered during the
model building phase is not recommended.
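The risk of extrapolation can be made concrete with the DF unstructured model from Table 3. The sketch below assumes that the quadratic age weight is negative (–0.09); the range guard and the names `AGE_MIN`, `AGE_MAX`, and `allow_extrapolation` are illustrative and are not part of the published scoring program.

```python
AGE_MIN, AGE_MAX = 6.56, 15.85  # age range of the normative sample

def expected_df_unstructured(age, allow_extrapolation=False):
    """Expected DF unstructured score, refusing ages outside the sample range."""
    if not allow_extrapolation and not (AGE_MIN <= age <= AGE_MAX):
        raise ValueError("age lies outside the range of the normative sample")
    centered_age = age - 11.0
    # Assumption: the age-squared weight is negative (-0.09).
    return 14.95 + 1.05 * centered_age - 0.09 * centered_age ** 2

# Age at which the fitted parabola peaks: 11 + 1.05 / (2 * 0.09).
peak_age = 11.0 + 1.05 / (2 * 0.09)

print(round(peak_age, 2))
print(round(expected_df_unstructured(peak_age, allow_extrapolation=True), 2))
print(round(expected_df_unstructured(20.0, allow_extrapolation=True), 2))
```

Under this assumption the fitted curve peaks shortly after the oldest age in the sample and then declines. That decline is a property of the quadratic form rather than evidence about older adolescents, which is one way an extrapolated norm could misclassify a normal performance.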
