THEORY, CONTEXT, AND MECHANISMS
Meta-Analysis of the Impact of Reading Interventions for
Students in the Primary Grades
Russell Gersten, Kelly Haymond, Rebecca Newman-Gonchar, Joseph Dimino, and Madhavi Jayanthi
Instructional Research Group, Los Alamitos, California, USA

ABSTRACT
This meta-analysis systematically reviewed the most up-to-date literature to determine the effectiveness of reading interventions on measures of word and pseudoword reading, reading comprehension, and passage fluency, and to determine the role intervention and study variables play in moderating the impacts for students at risk for reading difficulties in Grades 1–3. We used random-effects meta-regression models with robust variance estimates to summarize overall effects and to explore potential moderator effects. Results from a total of 33 rigorous experimental and quasi-experimental studies conducted between 2002 and 2017 that met WWC evidence standards revealed a significant positive effect for reading interventions on reading outcomes, with a mean effect size of 0.39 (SE = .04, p < .001, 95% CI [0.32, 0.46]). Moderator analyses demonstrated that mean effects varied across outcome domains and areas of instruction.

ARTICLE HISTORY
Received 29 January 2019; Revised 22 October 2019; Accepted 31 October 2019

KEYWORDS
Reading; response to intervention; multi-tiered system of support; Tier 2 intervention; meta-analysis
Multi-tiered systems of support (MTSS), also referred to as Response to Intervention
(RtI), have become routine in American elementary schools, especially in the area of lit-
eracy/reading. In 2010–2011, for example, full implementation of MTSS in Grade 1
reading occurred in 71 percent of schools from a demographically representative sample
(Balu et al., 2015). The massive scale-up of MTSS was fueled by major pieces of federal
legislation, such as the Reading First portion of the No Child Left Behind Act (NCLB,
2002), the Individuals with Disabilities Education Act (IDEA, 2004), and the Every
Student Succeeds Act (ESSA, 2015). ESSA explicitly called for an emphasis on evidence-
based interventions with "strong" and "moderate" levels of evidence, based on the What
Works Clearinghouse (WWC) standards (U.S. Department of Education [U.S. ED],
Institute of Education Sciences [IES], & What Works Clearinghouse, 2013).
With the rapid and widespread implementation of MTSS in reading across schools, a
national study was undertaken to examine the impact of high-quality MTSS in reading
(identified as high-quality by experts and onsite evaluation teams) on the performance
of students in primary grades (Balu et al., 2015). To the surprise of many in the reading
research community, the evaluation found statistically significant negative effects on
Grade 1 reading performance and non-significant impacts on Grades 2 and 3 reading
performance in 146 elementary schools in 13 states using a regression discontinuity
design (Imbens & Lemieux, 2008).
Some in the reading research community raised concerns about the study design, the
use of a regression design to answer questions about effectiveness, how impacts were com-
bined across districts that use different curricula, and the limited monitoring of fidelity of
implementation (e.g., Fuchs & Fuchs, 2017; Gersten, Jayanthi, & Dimino, 2017). Yet, the
findings are hard to ignore. They raise questions about the effectiveness of interventions in
authentic settings: specifically, how could interventions based on principles from scientific research, or those actually shown to be evidence-based according to current standards, be ineffectual or even slightly negative in practice?
There are at least three plausible reasons for the finding. The first is that the evidence base on the effectiveness of beginning reading interventions may not be as robust or
consistent as previously believed. The second is that there is a body of rigorously con-
ducted research documenting the effectiveness of beginning reading interventions and
the instructional practices used in the research, but these interventions were not imple-
mented with fidelity in practice. The third could be a result of the types of outcome
measures used. Measure types have been found to moderate the size of the impacts
(Elleman, Lindo, Morphy, & Compton, 2009). Students in the national evaluation were
assessed on comprehensive measures of reading performance, as opposed to measures that assess discrete reading proficiencies, which are often well aligned with the intervention's focus.
Rather than listing the names of interventions with rigorous research support, as
done for example on the WWC website, we thought it more important to conduct a
careful examination and rigorous meta-analysis of the body of contemporary research
on reading interventions relevant to MTSS. Doing so would allow us to articulate the
instructional practices and principles that underlie the interventions found to be effect-
ive, and to delineate the outcomes and the grade levels for which there is strong
evidence to support intervention. It also seemed important to examine factors such as
the type of interventionist and the level of support and monitoring provided to the
interventionist.
First, we briefly review prior meta-analyses and literature reviews on this topic.
A Brief Summary of Meta-Analyses and Literature Reviews on Reading
Interventions
Six syntheses of research (either literature reviews or meta-analyses) address the topic of
reading interventions in the primary grades, albeit from different perspectives. Three of
these do not examine the impacts of reading interventions: Al Otaiba and Fuchs (2002),
Stuebing et al. (2015), and Tran, Sanchez, Arellano, and Swanson (2011) explored the
relationship between students' prior reading abilities and other learner characteristics
and their responsiveness to reading intervention.
Slavin, Lake, Davis, and Madden (2011) examined 70 studies of programs geared
toward providing support to struggling readers in elementary school (K-5), including
not only the small-group interventions (20 studies) and one-on-one interventions (20
studies) typical for Tier 2 of MTSS, but also whole-class instructional practices (16
studies) and computer-based (14 studies) instruction for at-risk readers. The studies in
the review were not evaluated for quality, though the authors did limit their review to
studies that included randomization or matching to form a comparison group, a far
lower standard than the WWC standard used for these studies. Studies were only
included if the program lasted at least 12 weeks. Effects ranged from 0.09 for computer-
based interventions to 0.56 for whole-class interventions. One-on-one and small group
interventions resulted in effects of 0.39 and 0.31, respectively, suggesting that both
approaches show evidence of promise for Tier 2 intervention.
Wanzek and colleagues conducted two meta-analyses of reading intervention studies
involving students in kindergarten through Grade 3. The first (Wanzek et al., 2016) examined studies of shorter interventions (lasting fewer than 100 sessions); the second (Wanzek et al., 2018) included only longer interventions (lasting more than 100 sessions). The meth-
odology was similar, incorporating RCTs and QEDs only, but not ensuring that the study
met rigorous WWC standards. The first set included 72 studies, published between 1995
and 2013. The second, more recent meta-analysis of longer interventions located only 25
studies published from 1995 to 2015, using similar methodology.
For the first meta-analysis of shorter interventions, the authors classified outcomes
into one of two domains developed to correspond with the simple view of reading
(Francis, Kulesz, & Benoit, 2018). The first addressed the broad area of decoding,
including pre-reading skills (phonological awareness, rhyming, letter identification), as
well as measures of decoding of phonetically regular words and pseudowords, and word
reading and fluency. The second domain was called multicomponent and was essentially
a composite of both listening comprehension and reading comprehension, as well as
oral vocabulary and reading vocabulary. Using a random effects model, the authors
found positive mean effects ranging from 0.54 to 0.62 on the composite outcomes of
foundational reading and reading-related skills and 0.36 to 1.02 on language and com-
prehension measures. There was no evidence that group size, intervention type, grade
level, or interventionist were related to the magnitude of impacts.
Findings from the 2018 meta-analysis of longer interventions produced a mean effect
size of 0.28 when corrected for publication bias. The effects were found to be homoge-
neous, precluding the use of moderator analysis or meta-regression.
Rationale for the Present Meta-Analysis of Tier 2 Reading Interventions in
Grades 1–3
We decided to conduct a new meta-analysis for several reasons. One reason is that we
wanted to use robust variance estimation (RVE; Hedges, Tipton, & Johnson, 2010), a
more contemporary approach that addresses dependent effect sizes arising from multiple
outcomes and comparisons within studies. RVE allows researchers to model all depend-
encies statistically, compared to traditional meta-regression approaches, which address
dependencies by selecting specific comparisons, selecting a single measure, or aggregat-
ing all measures by computing an average effect. Of the six related syntheses we found,
only the most recent meta-analysis by Wanzek and colleagues (2018) used RVE but that
study focused on longer reading interventions. The current study used RVE on a set of
studies that have not been previously examined with this type of analysis.
A second reason for conducting a new meta-analysis is to limit the grade level to
those in which students are expected to begin reading. Typically, kindergarten interven-
tion studies include very few, if any, reading measures. Instead, they often include meas-
ures of pre-reading or reading-related skills such as listening comprehension, rhyming,
and phonemic awareness. The three previous reviews (Slavin et al., 2011; Wanzek et al.,
2016, 2018) included kindergarten studies and measures of pre-reading skills
(i.e., phonological awareness, listening comprehension). As our goal was to determine
whether students receiving the intervention progressed beyond the pre-reading stage
and truly learned to read, we only included studies from Grades 1, 2, and 3 to reflect
the grades included in the national RtI evaluation study (Balu et al., 2015). We also lim-
ited the outcomes to include only measures of reading performance (word reading, pas-
sage fluency, reading comprehension), which is in line with the ESSA standards for
evaluating study outcomes in the primary grades (Center for Research and Reform in
Education & Johns Hopkins University, 2019) and the framework used for assessing stu-
dent reading performance in the Reading First national evaluation (Gamse, Jacob,
Horst, Boulay, & Unlu, 2008). (Reading First used reading comprehension and decoding
to assess the reading performance of struggling students in Grades K–2). We did not
include studies of interventions above Grade 3 because the interventions in Grades 4
and 5, for example, are very different from those in Grades 1–3: They focus more on
comprehension, vocabulary development, and fluency building, and less on decoding.
Finally, given the focus of ESSA (2015) on using interventions with "moderate" to "strong" levels of evidence from studies that have met WWC standards (Version 3.0;
U.S. ED et al., 2013) for high-quality causal studies, we wanted to conduct a formal
review of the studies using WWC standards and include only those studies in the meta-
analysis that met those standards. The current set of studies is thus a more focused set
that has been screened for the designs' rigor and the findings' trustworthiness.
Purpose of the Present Meta-Analysis
The purpose of this meta-analysis is to synthesize rigorously conducted randomized
controlled trials and quasi-experimental studies on reading interventions for students
who are at risk for reading difficulty in Grades 1–3. The research questions guiding this
project are:
1. Overall, how effective are reading interventions that are designed to improve the
reading outcomes (i.e., reading of words and pseudowords, passage reading flu-
ency, and reading comprehension) of Grades 1–3 students who are considered at
risk for reading difficulties?
2. Do study characteristics (i.e., nature of comparison, design, grade level, risk sta-
tus, and outcome domain/type) or intervention characteristics (i.e., group size,
interventionist, average hours per week of intervention, whether the intervention
was scripted, areas of instruction within the interventions, and support provided
to interventionists) moderate the effect of reading interventions on read-
ing outcomes?
Method
Literature Search and Selection of Relevant Studies for the Meta-Analysis
The goal of the search was to locate all studies published from January 2002 to March
2017 focused on reading interventions for students in Grades 1–3. The literature search
began with a keyword search of the following databases: Academic Search Premier,
Campbell Collaboration, Educators Reference Complete, ERIC, PsycINFO, Social
Sciences Citation Index, and WorldCat. The following keywords were used: reading, lit-
eracy, fluency, decoding, vocabulary, comprehension, reading ability, reading proficiency,
reading achievement, response to intervention and instruction, reading intervention, RtI,
response to intervention, response to instruction, Tier 2 intervention, tutoring, small-group
instruction, one-on-one instruction, intensive intervention, at-risk students, at-risk, contin-
ued risk, non-responders, responders, reading difficulties, reading disabilities, and strug-
gling readers. In addition, we examined all WWC intervention reports in beginning
reading and two relevant WWC Practice Guides, Assisting Students Struggling with
Reading and Improving Reading Comprehension in Kindergarten Through 3rd Grade. We
performed a version of hand-searching known as snowballing, by checking the reference
lists of research syntheses on the topic. Finally, we solicited recommendations from key
researchers in the field on studies likely to be eligible; this procedure is documented in Gersten et al. (2017). Toward the end of the search and review process, we examined any studies not previously located but included in the foundational reading practice guide (Foorman et al., 2016) and other meta-analyses and research syntheses (e.g., Wanzek & Vaughn, 2007; Wanzek et al., 2016).
The search resulted in the identification of 2,423 publications. All studies were
screened for eligibility based on the title, keywords, and abstracts. The studies were then
examined to determine whether they met the following inclusion criteria:
(a) Location. To be eligible, studies had to take place in the United States.
(b) Publication date. We limited the search to studies published between 2002 and
2017. The 2002 start date was chosen because it marks a transition in how teachers
approached reading interventions. Beginning circa 2002, initiatives in states such as
Texas and California (and numerous others) were reinforced by the Reading First pro-
grams (NCLB, 2002) emphasis on small-group preventative reading interventions based
on early screening in the primary grades. Research after this date focused more on the
effectiveness of these preventative interventions. Therefore, studies published after 2002
seemed most relevant to the research questions.
(c) Reading intervention. The study had to focus on the effectiveness of a reading
intervention: that is, preventative instructional practices and activities designed to help
students who are considered at risk for reading difficulties (e.g., Gersten, Compton,
et al., 2009). The interventions had to be at least 8 h in duration and could be provided
to small groups of students or individually to one student. We did not exclude any stud-
ies based on the size of the small groups. The interventions could be conducted at
school, either during school or after school, or at non-school clinics. The intervention
could be conducted during the school year or during summer break. They could be
delivered by teachers, researchers, tutors, volunteers, parents, or paraprofessionals, pro-
vided they followed a specific intervention program or a clearly outlined approach.
Although we included interventions that taught phonological awareness, we did not
include interventions that focused solely on phonological awareness without providing
any instruction on reading words and/or pseudowords. We also did not include
whole-class (Tier 1) interventions (even if it was noted that the entire class or school
was considered at risk for reading failure) or intensive Tier 3 interventions that were
meant to meet the individual needs of students who failed to benefit from evidence-
based interventions (e.g., Fuchs, Fuchs, et al., 2008; Gersten, Compton, et al., 2009). In
other words, studies that selected only students who were nonresponders to a Tier 1 or
2 intervention were excluded. Denton et al. (2013), for example, examines the impact of
an intervention for students who have failed to show progress in both Tier 1 and Tier 2
interventions and, therefore, was excluded from the meta-analysis. Finally, we excluded
interventions that were delivered only at home, conducted in a language other than
English, or included only a professional development component for teachers on the
topic and lacked a specific intervention or intervention approach.
(d) Study design. Only RCTs and QEDs were included.
(e) Sample. The participants had to be students in Grades 13 who were considered
at risk for reading difficulties. To be considered at risk, students had to have (a) a score
on a valid screener or screening battery indicating that the student was likely to be at
risk for possible reading failure at the end of the school year or (b) a score on a norm-
referenced standardized test (such as Woodcock Reading Mastery) indicating that the
student performed below the 40th percentile at the beginning of the school year or at
the end of the previous school year. If a study sample included students from grades
that were outside the scope of the review (e.g., Grades K, 4, or 5), then the study had to
meet one of the following criteria: (a) the study findings disaggregated the results of stu-
dents in eligible grades or (b) students in eligible grades represented over 50 percent of
the aggregated mixed-age sample.
(f) Outcomes. Studies had to include outcome measures of reading proficiencies and
skills (i.e., word reading, passage fluency, reading comprehension, or overall reading
achievement). Studies that only included measures of pre-reading skills such as phon-
emic awareness, rhyming, and oral comprehension were excluded.
Of the 2,423 publications that were examined, 54 met the initial criteria for inclusion.
See Figure 1 for a pictorial representation of the screening process.
Coding of Studies
The 54 publications that met initial inclusion criteria were coded in 3 phases. In Phase
1, publications were coded for quality of research design. In Phase 2, publications were
coded to identify study characteristics and intervention characteristics. Finally, in Phase
3, publications were coded to explore the areas of reading covered in the interven-
tion lessons.
Phase 1 Coding: Quality of Research Design
In the first phase, two members of the research team (who are certified WWC
reviewers) independently examined each publication for the study design's strength and quality using the WWC Procedures and Standards Handbook (Version 3.0; U.S. ED et al., 2013).
Only studies that met WWC standards (with or without reservations) were included in
Phases 2 and 3.
Several publications we reviewed included more than one study (e.g., Denton,
Fletcher, Taylor, Barth, & Vaughn, 2014; Lane, Pullen, Hudson, & Konold, 2009).
For the purposes of this project, we defined a study as any comparison with a
unique treatment group compared to a unique, business-as-usual control condition.
Studies comparing the impacts of two researcher-controlled interventions, as well as
comparisons of variations in treatments, were excluded. For example, in Lane et al.
(2009), researchers report the effects of an intervention, as well as three variations of
that intervention, when compared with that of a business-as-usual control condition.
We considered each intervention and variation as a unique treatment group, and
each comparison of a unique treatment group with a business-as-usual control as a
separate study; therefore, we counted four studies in this publication (i.e., T0 vs. C,
T1 vs. C, T2 vs. C, and T3 vs. C). The comparisons of each variation in treatment
with the others were excluded. Studies of variations in treatments focus on a much
more precise research question than the effectiveness of reading interventions. The framework of this meta-analysis could not account for these more experimental manipulations of specific components.

[Figure 1. Literature search, screening, and reviewing. Of the 2,423 publications screened for eligibility, 2,369 were excluded at screening (not conducted in the U.S., not published between 2002 and 2017, no eligible reading intervention, or not an eligible study design, sample, or outcome). The remaining 54 publications were coded for quality: 34 met WWC evidence standards and 20 did not, because of design quality (a randomized controlled trial with high attrition or a quasi-experimental design with analysis groups not shown to be equivalent) or a confounding factor (only one unit assigned to at least one condition, or the intervention always used in combination with another intervention). Nine publications were then excluded because they compared two unique treatments, leaving 25 publications comprising 33 studies in the meta-analysis.]
In total, of the 54 publications reviewed, 25 publications included 33 separate studies
that met standards (with or without reservations). See Figure 1.
Phase 2 Coding: Study and Intervention Characteristics
For studies that met WWC group design standards (with or without reservations), we
coded the following study characteristics: nature of the comparison, design (either RCT
or QED), grade level, participants' risk level, and outcome domain and type. Coding of
the intervention characteristics addressed the following: What was the size of the inter-
vention group, who implemented the intervention (i.e., interventionist), was the inter-
vention scripted, was monitoring and feedback provided to the interventionist, and how
many hours of instruction were provided per week. See Table 1 for the operational defi-
nitions. Two members of the research team coded all study and intervention charac-
teristics. The researchers discussed and rectified any discrepancies. After the initial
coding, a third researcher coded a randomly selected 20 percent of the studies for reli-
ability purposes. Reliability was 90.6 percent.
Phase 3 Coding: Area/Focus of Instruction
We examined descriptions of the interventions provided in the publications and cataloged
the interventions in two ways: (a) the target area of instruction, and (b) the focus of instruc-
tion. Each study was examined to determine whether any of the following reading areas were addressed during the intervention: phonological awareness, decoding, encoding (spelling), fluency, vocabulary, comprehension, and writing. If an intervention covered a reading area in any manner, minimally or extensively, then the study was coded for that area.
During our coding, we noticed that some studies covered the main areas of reading
minimally during the intervention while others gave evidence of extended explicit
instruction. For instance, many studies mentioned that they included reading compre-
hension in the lesson, but then only described that they asked comprehension questions
as or after students read a passage, without providing any explicit instruction in com-
prehension strategies. Consequently, instruction in each intervention was further exam-
ined to determine whether the reading areas were taught routinely and explicitly. If so,
the studies were also coded for having a focus in that area of reading. This additional level of coding, the focus of instruction, was limited to decoding, fluency, reading comprehension, and vocabulary.
An example to illustrate how the research team determined whether a component
was a focus of instruction is the coding of the reading comprehension component in
Denton et al. (2014). This study consisted of two treatment conditions, explicit instruc-
tion and guided reading, and a control group. Comprehension was coded as a focus of
instruction in the explicit instruction condition but not the guided reading condition. In
the former, instruction consisted of teachers modeling comprehension strategies using
"think-alouds" and providing specific feedback when students practiced in small groups.
Although the guided reading condition included discussion activities, teachers never
modeled or provided any clear guidance on how and when to use various strategies to
discern a cause-effect relationship or for succinct retelling.
Coding of studies during this phase was done collaboratively by two members of the
research team (who are experts in beginning reading). Reliability was calculated on 20
percent of randomly selected studies. Reliability was 85.71 percent.
Data Analysis
Calculation of Effect Sizes
To determine each intervention's impact, we calculated the average effect size for each domain of reading (i.e., word and pseudoword reading, passage reading fluency, vocabulary, and reading comprehension). The effect size was calculated for each outcome using the means and pooled standard deviations for the intervention and comparison groups, and corrected for small-sample bias using Hedges' (1981) procedures. In cases where means and standard deviations were not available, the t or F statistics and the treatment and comparison group sample sizes were used to calculate Hedges' g.

Table 1. Study and intervention characteristic definitions.
Design: RCT = randomized controlled trial; QED = quasi-experimental design.
Grade level (a): 1 = first grade only; 2/3 = second- and third-grade combination class.
Nature of the comparison group: Core reading instruction only = business-as-usual, whole-class reading instruction with no additional support (i.e., Tier 1); School-provided intervention = reading interventions typically provided by the school/district (i.e., some form of preventative intervention provided in addition to core Tier 1 reading).
Participants' risk level (b): At risk = only students in the 25th percentile or lower on a standardized norm-referenced screener; Minimal risk = students considered potentially at risk who score below the 40th percentile on a standardized norm-referenced screener.
Outcome measure domains: Word or pseudoword reading (e.g., TOWRE, Woodcock-Johnson Word Attack); Passage reading fluency (e.g., AIMSweb Standard Reading Assessment Passages); Reading comprehension (e.g., Woodcock Reading Mastery Tests [WRMT] Passage Comprehension subtest, GRADE reading comprehension subtest).
Measure type (c): Standardized tests = existing measures administered, scored, and interpreted in the same way for all test-takers; Researcher-developed measures = only those the researchers developed for the study.
Group size: Small group = groups of more than one student; One-on-one = 1 student with 1 interventionist.
Interventionist (d): Certified teacher = had a teaching credential, even if not employed as a full-time teacher at the schools where the studies took place; Paraprofessional = anyone who worked or volunteered at the school as part of the study and had no teaching credential; Research staff = typically graduate students at a university.
Avg. hrs./week of instruction: Low = less than 1.5 hours per week; Medium = 1.5 to 2.0 hours per week; High = 2.0 or more hours per week.
Scripted: Yes = interventionist provided with step-by-step instructions on what to say and do during each session; No = interventionist not provided with step-by-step instructions.
Monitoring and feedback: Yes = interventionists were observed conducting the intervention and were provided feedback after they were observed; No = interventionists were not observed and no feedback was provided.
Notes. (a) If a study included students from more than one grade (e.g., from Grades 1 and 2), the study was assigned to the grade level of the majority of the sample. (b) If the authors did not provide a percentile on a nationally normed test to describe the at-risk sample, the study was not coded for this variable. (c) Measures of pre-reading skills such as phonological awareness, rhyming, and letter naming were excluded, as were measures of listening comprehension, spelling, and writing. (d) Studies with a mix of interventionists (e.g., both teachers and paraprofessionals) were coded by the most prevalent category.
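To make the effect-size computation described above concrete, the sketch below applies the standard Hedges' g formulas (the pooled-standard-deviation effect size with the small-sample correction, and the conversion from a reported t statistic); the numeric values in the example are hypothetical and are not drawn from any study in the review.

```python
import numpy as np

def hedges_g(m_t, m_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with Hedges' (1981) small-sample
    correction: g = J * (M_t - M_c) / SD_pooled, J = 1 - 3 / (4*df - 1)."""
    sd_pooled = np.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                        / (n_t + n_c - 2))
    d = (m_t - m_c) / sd_pooled
    j = 1 - 3 / (4 * (n_t + n_c - 2) - 1)   # bias-correction factor
    return j * d

def hedges_g_from_t(t, n_t, n_c):
    """Recover g when only a t statistic and the group sizes are reported
    (for a two-group F test with 1 numerator df, t = sqrt(F))."""
    d = t * np.sqrt(1.0 / n_t + 1.0 / n_c)
    j = 1 - 3 / (4 * (n_t + n_c - 2) - 1)
    return j * d

# Hypothetical example: intervention group M = 102, SD = 14, n = 45;
# comparison group M = 96, SD = 15, n = 44 -> g is roughly 0.41.
print(hedges_g(102, 96, 14, 15, 45, 44))
print(hedges_g_from_t(2.1, 45, 44))
```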
Meta-Analytic Procedures
To account for dependencies in our data, we used random-effects robust variance estimation (RVE) techniques (Hedges et al., 2010). RVE permits the comparison of effect sizes across studies in which multiple, dependent effect sizes are drawn from the same sample. Random-effects analyses were conducted using the statistical software Stata (StataCorp, 2015) and the Robumeta package (Hedberg, 2011), a macro that applies the RVE techniques. In RVE, the mean correlation between all pairs of effect sizes within a study (ρ) must be specified to estimate the study weights and calculate the between-study variance. We used a ρ value of .80 to estimate the between-study variance and then conducted sensitivity analyses using ρ values of 0 to .90; Hedges et al. (2010) demonstrated that the value selected for ρ generally does not affect results much and recommended analyzing models with varying ρ values, and we found no meaningful differences in the results across models, indicating that our findings were robust across estimates of ρ. The small-sample correction developed by Tipton (2015) was implemented in Robumeta for all models, as RVE results have been shown to inflate the Type I error rate when the meta-analysis includes fewer than 40 studies (Tipton, 2015).
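As a rough illustration of these procedures, the sketch below estimates the intercept-only RVE model with the approximate correlated-effects weights of Hedges, Tipton, and Johnson (2010), w_ij = 1/(k_j(v̄_j + τ²)); it takes τ² as given, omits Tipton's (2015) small-sample correction, and uses made-up data, so it is a simplified approximation of what Robumeta computes rather than a substitute for it.

```python
import numpy as np

def rve_intercept_only(es, var, study, tau2):
    """Weighted mean effect size and robust (sandwich) standard error for
    the intercept-only RVE model; weights follow the correlated-effects
    approximation w_ij = 1 / (k_j * (mean sampling variance + tau2))."""
    es = np.asarray(es, dtype=float)
    var = np.asarray(var, dtype=float)
    study = np.asarray(study)
    w = np.empty_like(es)
    for s in np.unique(study):
        m = study == s
        w[m] = 1.0 / (m.sum() * (var[m].mean() + tau2))
    beta = np.sum(w * es) / np.sum(w)            # overall mean effect
    resid = es - beta
    # Robust variance: squared weighted residual totals, summed by study.
    num = sum(np.sum(w[study == s] * resid[study == s]) ** 2
              for s in np.unique(study))
    se = float(np.sqrt(num / np.sum(w) ** 2))
    return float(beta), se

# Hypothetical data: studies A and B each contribute two effect sizes.
g   = [0.45, 0.30, 0.55, 0.20, 0.40]
v   = [0.04, 0.04, 0.06, 0.05, 0.03]
sid = ["A", "A", "B", "B", "C"]
print(rve_intercept_only(g, v, sid, tau2=0.02))
```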
We estimated a series of meta-regression models using RVE. First, we ran an inter-
cept-only model in which the estimate for the constant represented the average weighted
effect size across all 33 studies (Tanner-Smith & Tipton, 2014). The Robumeta package
calculated the following indices of heterogeneity: the Q statistic and its p-value, and estimates of I² (the percentage of between-study heterogeneity not due to chance variation in effects) and τ² (the true variance in the population of effects).
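For readers unfamiliar with these indices, the sketch below computes the conventional Q, I², and a DerSimonian-Laird τ² for a set of effect sizes treated as independent; it is only a rough reference point, since the method-of-moments estimator Robumeta uses under RVE accounts for within-study dependence and will differ.

```python
import numpy as np

def heterogeneity(es, var):
    """Conventional heterogeneity indices for independent effect sizes:
    Q, I^2 (% of variability beyond chance), and a DerSimonian-Laird tau^2."""
    es = np.asarray(es, dtype=float)
    var = np.asarray(var, dtype=float)
    w = 1.0 / var                                # fixed-effect weights
    mean_fe = np.sum(w * es) / np.sum(w)
    q = np.sum(w * (es - mean_fe) ** 2)
    df = len(es) - 1
    i2 = max(0.0, (q - df) / q) * 100.0
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2

print(heterogeneity([0.45, 0.30, 0.55, 0.20, 0.40],
                    [0.04, 0.04, 0.06, 0.05, 0.03]))
```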
Next, we examined the role of moderators such as group size, grade level, and level
of support provided to interventionists. Though one meta-regression model with covari-
ates for all moderators is preferable, the number of studies that met the inclusion crite-
ria was small, and not every study included information that permitted coding of all
moderators. Thus, this approach was not taken because results would be uninterpretable
due to insufficient degrees of freedom. Instead, we examined potential moderators using
separate RVE meta-regression models with only the moderator of interest entered as a
predictor. We interpret these results with caution due to potential confounding effects
of other moderators that are unaccounted for in these single-predictor models.
Moreover, a small number of single-predictor models remained underpowered (df <4),
likely a result of large imbalances in the data (Tipton, 2015).
The moderator variables were dummy coded and included as covariates in each model. To estimate a mean effect size for each level of the moderator variables (i.e., RCT and QED are levels of the design moderator variable), intercept-only models also were run for each level of the moderator. The p-value for determining statistical significance in each of the moderator analyses was set to p < .05.
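The following sketch illustrates that coding scheme with made-up records, reusing the rve_intercept_only function from the earlier sketch: the design moderator becomes a 0/1 dummy covariate for the single-predictor meta-regression, and separate intercept-only models are fit within each level to obtain the level-specific mean effects reported in the Results.

```python
import numpy as np  # assumes rve_intercept_only() from the sketch above

# Hypothetical records: one row per effect size.
es     = np.array([0.45, 0.30, 0.55, 0.20, 0.40, 0.15])
var    = np.array([0.04, 0.04, 0.06, 0.05, 0.03, 0.07])
study  = np.array(["A", "A", "B", "B", "C", "D"])
design = np.array(["RCT", "RCT", "RCT", "RCT", "QED", "QED"])

# Dummy coding: the covariate entered in the single-predictor model.
is_qed = (design == "QED").astype(int)

# Intercept-only model within each level of the moderator.
for level in ("RCT", "QED"):
    m = design == level
    beta, se = rve_intercept_only(es[m], var[m], study[m], tau2=0.02)
    print(level, round(beta, 2), round(se, 2))
```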
Publication Bias
The potential impact of publication bias using the trim-and-fill methodology (Duval &
Tweedie, 2000) was examined by constructing a funnel plot of the effect sizes and not-
ing any asymmetry in the distribution of effects. The plot was then systematically
trimmed, removing the effect sizes causing the asymmetry, and filled in with any effect
sizes that may have been missing from unpublished studies that resulted in small and
non-significant treatment effects. The analysis estimated the number of missing effect
sizes and recalculated the overall mean effect size in a way that reflects the presence of
these missing effects.
Results
Study Characteristics
A total of 33 studies from 25 publications met WWC group design standards and were
included in the final meta-analysis. These 33 studies spanned 13 years (2004–2016) and
provided a total of 128 effect sizes. The sample sizes in the studies ranged from 21 to
6,888 students. Total sample size across all studies was 11,737 students (median sample
size = 89).
Thirty of the studies were RCTs; the remaining three were QEDs. The comparison
condition for the majority of studies (k = 21) was Tier 1 core classroom instruction (i.e.,
nothing other than what the classroom teacher chose to provide). In the remaining 12
studies, the comparison was typical school- or district-provided intervention. Twenty-
two studies were conducted in Grade 1, and 11 studies were in Grades 2 and/or 3. Only
16 studies provided the information necessary for coding participants' risk level (across the studies, a wide range of screening measures and operational definitions were used, and the screeners were typically not nationally normed; if the authors did not provide a percentile on a nationally normed test, the study was not coded for this variable). Of
those, 12 studies included students in the minimal risk category, and only 4 studies
included students in the at-risk category (i.e., 25th percentile or lower). The most com-
mon outcome domain was word or pseudoword reading (included in all but one of the
33 studies). Nineteen studies included outcomes in reading comprehension, 16 included
passage reading fluency outcomes, and only two studies included outcomes in vocabu-
lary. The study characteristics for each study included in the meta-analysis are presented
in Table 2.
Intervention Characteristics
Interventions were delivered to students in one-on-one settings in 21 of the studies and in small-group settings (group sizes in the included studies ranged from 2 to 5 students) in 12 studies.
Table 2. Study and intervention characteristics for each study included in the meta-analysis.
Columns: Author; Design (b); Comparison (c); Grade; At-risk level (d); Outcome domain (e); Grouping (f); Interventionist (g); Scripted (h); Avg. hrs./wk. (i); Monitoring (j).
Allor and McCathren (2004) Study 1 RCT CR 1 RC 1:1 P Y L Y
Allor and McCathren (2004) Study 2 RCT CR 1 PF, RC, WR 1:1 P Y L Y
Berninger, Abbott, Vermeulen, and Fulton (2006) RCT CR 2/3 A WR SG R N M N
Blachman et al. (2004) RCT SI 2/3 A PF, RC, WR 1:1 CT N H Y
Case et al. (2010) RCT CR 1 WR SG R Y M Y
Case et al. (2014) RCT CR 1 PF, WR SG R Y M Y
Denton et al. (2010) RCT SI 1 RC, WR SG CT N H Y
Denton et al. (2014) RCT SI 2/3 M PF, RC, WR SG P Y H Y
Denton et al. (2014) RCT SI 2/3 M PF, RC, WR SG P Y H Y
Fien et al. (2015) RCT SI 1 M PF, WR SG P N H Y
Fuchs, Compton, Fuchs, Bryant, and Davis (2008) RCT CR 1 WR SG R N H N
Gunn et al. (2005) RCT CR 2/3 PF, RC, WR SG P Y M Y
Jacob, Armstrong, Bowden, and Pan (2016) RCT SI 2/3 PF, RC, WR 1:1 P Y L Y
Jenkins, Peyton, Sanders, and Vadasy (2004) QED CR 1 A RC, WR 1:1 P Y M Y
Lane et al. (2009) RCT CR 1 M WR 1:1 R N M N
Lane et al. (2009) RCT CR 1 M WR 1:1 R N M N
Lane et al. (2009) RCT CR 1 M WR 1:1 R N L N
Lane et al. (2009) RCT CR 1 M WR 1:1 R N L N
May, Sirinides, Gray, and Goldsworthy (2016) RCT SI 1 RC, WR 1:1 CT N H N
OConnor et al. (2010) RCT CR 2/3 PF, RC, WR 1:1 P N L N
OConnor et al. (2010) RCT CR 2/3 PF, RC, WR 1:1 P N L N
Pullen, Lane, and Monaghan (2004) RCT CR 1 M WR 1:1 P N L N
Scanlon, Vellutino, Small, Fanuele, and Sweeney (2005) RCT SI 1 RC, WR 1:1 CT H Y
Scanlon et al. (2005) RCT SI 1 RC, WR 1:1 CT H Y
Schwartz (2005) (a) RCT CR 1 RC, WR 1:1 CT N H N
Smith et al. (2016) RCT SI 1 M PF, WR SG P N H Y
Vadasy and Sanders (2011) (a) RCT CR 1 PF, RC, WR 1:1 P Y L Y
Vadasy, Sanders, and Peyton (2006) QED (a) QED CR 2/3 M PF, RC, WR 1:1 P Y M Y
Vadasy et al. (2006) RCT (a) RCT CR 2/3 M PF, RC, WR 1:1 P Y M Y
Vadasy et al. (2007) (a) RCT CR 2/3 M PF, WR 1:1 P Y M Y
Vellutino and Scanlon (2002) RCT SI 1 A WR 1:1 CT N H Y
Wang and Algozzine (2008) RCT CR 1 RC, WR SG P Y L N
Wanzek and Vaughn (2008) (a) QED SI 1 PF, WR SG R Y H Y
Notes. (a) Studies also included in Wanzek et al. (2016). (b) RCT = randomized controlled trial; QED = quasi-experimental design. (c) CR = core reading instruction; SI = school-provided intervention. (d) A = at risk, a sample with students only in the 25th percentile or lower; M = minimal risk, a sample that included students below the 40th percentile; blank = authors did not provide a percentile, or the information to calculate one, from a nationally normed test given for screening purposes. (e) WR = word and pseudoword reading; RC = reading comprehension; PF = passage fluency. (f) SG = small group; 1:1 = one-on-one. (g) P = paraprofessional; R = researcher; CT = certified teacher. (h) Y = scripted; N = not scripted. (i) L = low; M = medium; H = high. (j) Y = monitoring and feedback included; N = monitoring and feedback either not included or not reported.
The interventions were implemented by a researcher in nine studies, a
certified teacher in seven studies, and a paraprofessional in 15 studies. In 21 studies,
additional support in the form of monitoring and feedback was provided to the inter-
ventionists. Nearly half of the studies (k = 15) included scripted interventions. The average hours per week of intervention ranged from less than an hour (i.e., 45 minutes) to 4.17 hours per week (median = 2). The intervention characteristics for each study included
in the meta-analysis are described in Table 2.
Instructional Area/Focus of the Intervention
All the studies included in the meta-analysis examined interventions that focused on
building students' reading skills in more than one instructional area (e.g., decoding, flu-
ency, comprehension). In many respects, the interventions appeared to be similar to
each other, mainly addressing decoding and fluency, while also attending to one or
more areas of reading instruction, such as encoding, comprehension, vocabulary, phono-
logical awareness, and writing.
All but two studies (k = 31) addressed decoding. Many studies also included instruction in passage fluency (k = 29) and encoding (k = 23). Over 50 percent of the studies addressed phonological awareness (k = 19) and comprehension (k = 18). Vocabulary (k = 13) and writing (k = 7) were addressed less frequently.
All studies with decoding and fluency were coded for both area of instruction as well
as the focus of instruction. Only nine of the 18 studies that were coded for comprehen-
sion were also given a Yes under the focus code as they showed evidence of systematic
teacher-led explicit instruction in comprehension that went beyond asking literal ques-
tions, questions about title and pictures, or monitoring comprehension strategies. Only
one of the nine studies coded for vocabulary was also given a Yes for focus. Vocabulary
instruction rarely involved explicit teaching and interaction; it typically involved defin-
ing words if students asked, or asking students to look at pictures to derive the meaning
of a word.
Meta-Analytic Results
The meta-analysis included 128 effect sizes from 33 studies. See Table 3 for effect sizes
(Hedges' g) for all outcome measures by domain for each study. Effect sizes ranged widely, from −0.20 to 1.37. The mean effect size for these studies was 0.39 (SE = .04, p < .001, 95% CI [0.32, 0.46]), indicating that the reading interventions were generally effective across students, settings, and measures. As expected, treatment effects varied considerably. The I² estimate of the percentage of between-study heterogeneity not due to chance was 50.75%, with a τ² estimate of the true variance in the population of effects of 0.02.
Table 3. Outcomes and effect sizes.
Columns: Author; Total N; Outcome domain (a); Measure type (b); ES (c).
Allor and McCathren (2004) Study 1 86 RC S 0.50
Allor and McCathren (2004) Study 2 157 WR S 0.05, 0.13, 0.33, 0.44, 0.78
RC S −0.16
PF R 0.13
Berninger et al. (2006) 93 WR S 0.35
Blachman et al. (2004) 69 WR S 0.74, 0.87
RC S 0.53
PF S 0.70
Case et al. (2010) 30 WR S 0.48, 0.73, 0.73
Case et al. (2014) 123 WR S 0.02, 0.17, 0.21, 0.23
PF S 0.20
Denton et al. (2010) 422 WR S 0.42
RC S 0.51
Denton et al. (2014) 103 WR S 0.34, 0.40, 0.50
RC S 0.08, 0.13
PF S 0.16
Denton et al. (2014) 112 WR S 0.31, 0.50, 0.63
RC S 0.29, 0.46
PF S 0.45
Fien et al. (2015) 239 WR S 0.38, 0.45
PF S 0.30
Fuchs et al. (2008) 64 WR S 0.26, 0.26, 0.38, 0.46, 0.65
Gunn et al. (2005) 245 WR S 0.30, 0.52
RC S 0.32
PF S 0.24
Jacob et al. (2016) 1,166 WR S 0.11
RC S 0.10
PF S 0.09
Jenkins et al. (2004) 99 WR S 0.37, 0.50, 0.52, 0.73, 0.76, 1.12
RC S 0.74
Lane et al. (2009) 41 WR R 0.64, 0.71
WR S 1.24
Lane et al. (2009) 42 WR R 0.24, 0.29
Lane et al. (2009) 43 WR R 0.39, 0.55
Lane et al. (2009) 46 WR R 0.52, 0.59
WR S 1.02
May et al. (2016) 6,888 WR S 0.41
RC S 0.42
OConnor et al. (2010) 40 WR S 0.10, 0.56
RC S 0.48, 0.53
PF S 0.60, 0.75, 0.76, 0.87
OConnor et al. (2010) 43 WR S 0.25, 0.57
RC S 0.37, 0.44
PF S 0.81, 0.84, 0.93, 1.33
Pullen et al. (2004) 47 WR R 0.24, 0.81
WR S 0.54, 0.59
Scanlon et al. (2005) 114 WR S 0.31, 0.55
RC S 0.41
Scanlon et al. (2005) 117 WR S 0.51, 0.62
RC S 0.35
Schwartz (2005) 74 WR S 0.93, 1.37
RC S 0.14
Smith et al. (2016) 743 WR S 0.19
PF S 0.12
Smith et al. (2016) 729 WR S 0.23, 0.32
Smith et al. (2016) 749 WR S 0.24
PF S 0.18
Vadasy and Sanders (2011) 89 WR S 0.51
RC S 0.29
PF R 0.69
Vadasy et al. (2006) QED 31 WR S 0.61, 0.72
RC S 0.50
PF R 0.81
Vadasy et al. (2006) RCT 21 WR S 0.67, 0.75
RC S 0.21
PF R 0.55
Vadasy et al. (2007) 43 WR S 0.47
PF S 0.52
Vellutino and Scanlon (2002) 118 WR S 0.38
Wang and Algozzine (2008) 139 WR S −0.03, 0.39, 0.45
RC S 0.17
Wanzek and Vaughn (2008) 50 WR S 0.12, 0.18
PF S −0.20
Notes. (a) WR = word and pseudoword reading; RC = reading comprehension; PF = passage fluency. (b) R = researcher developed; S = standardized. (c) One effect size per measure; if four effect sizes are listed, four measures were used in that outcome domain.

Moderators
Eleven categorical moderator analyses of study characteristics (six variables) and intervention characteristics (five variables) were individually tested using each as a single
predictor in the meta-regression models. Some of these variables emerged as significant
moderators of the relationship between reading interventions and effect sizes on meas-
ures of students' reading proficiency.
Findings are summarized in Table 4. The coefficients from the intercept-only models
should be interpreted as the weighted effect size for studies with that level of moderator,
and statistically significant results indicate that the mean effect size for studies with that
level of the moderator is significantly different from zero.
Study Characteristics. The outcome domain for the measures, when comparing the three areas of reading (word or pseudoword reading, reading comprehension, passage reading fluency), significantly moderated effect size; effects from outcomes in the vocabulary domain could not be analyzed due to the small number of studies (k = 2). On average, outcomes in the word or pseudoword reading domain yielded the largest effect size (b = 0.41 [0.33, 0.50], p < .001, k = 32). Reading comprehension domain outcomes produced a slightly smaller effect (b = 0.32 [0.20, 0.43], p < .001, k = 19), followed by passage reading fluency outcomes, which generated the smallest effect size (b = 0.31 [0.17, 0.44], p < .001, k = 16). The only significant difference, however, was for outcomes in the word or pseudoword reading domain, which yielded significantly larger effect sizes than outcomes in the domains of reading comprehension or passage reading fluency (b = 0.10 [0.00, 0.19], p = .049, k = 32).
Study characteristic variables that did not significantly moderate the effect size included grade level (b = −0.06 [−0.26, 0.15], p = .542, k = 33), research design (b = −0.06 [−1.03, 0.92], p = .823, k = 33), the nature of the comparison group (b = −0.11 [−0.25, 0.04], p = .139, k = 33), participants' risk level (b = 0.13 [−0.18, 0.44], p = .330, k = 16), and standardized measures versus researcher-developed measures (b = 0.11 [−0.07, 0.28], p = .189, k = 33).
Intervention Characteristics. None of the intervention characteristics led to significant moderator effects. These included interventions implemented by researchers (b = −0.01 [−0.21, 0.20], p = .939, k = 33), interventions implemented by certified teachers (b = 0.14 [0.00, 0.28], p = .053, k = 33), interventions implemented by a paraprofessional (b = −0.12 [−0.25, 0.02], p = .086, k = 33), whether an intervention was scripted (b = −0.13 [−0.28, 0.02], p = .094, k = 31), whether or not monitoring and feedback was provided (b = −0.12 [−0.26, 0.03], p = .103, k = 33), average hours per week of intervention for the low (b = −0.08 [−0.30, 0.15], p = .464, k = 33), medium (b = 0.04 [−0.14, 0.23], p = .610, k = 33), or high (b = 0.03 [−0.13, 0.18], p = .722, k = 33) categories, and grouping (either small-group or one-on-one intervention) (b = −0.13 [−0.27, 0.01], p = .075, k = 33). When tested per grade level, however, grouping was a significant moderator for Grade 1 but not Grades 2 and 3. For Grade 1 specifically, effects were larger if the intervention was delivered to students individually rather than to groups of students (b = −0.16 [−0.32, −0.01], p = .042, k = 22).
Area/Focus of Instruction. Given that all the studies examined interventions with multiple areas of instruction, we tested a meta-regression model using all seven areas of instruction. Analyzing the areas of instruction simultaneously allowed us to examine each area's moderating influence while holding the other areas constant. Of the seven areas of instruction, phonological awareness, encoding (spelling), and writing appeared to be significant moderators of effect sizes when holding all other areas of instruction constant (see Table 4). Interventions that included phonological awareness tended to result in smaller effects across the word or pseudoword reading, reading comprehension, and passage reading fluency outcomes (b = −0.19 [−0.32, −0.05], p = .010). However, the present meta-analysis did not specifically address phonological awareness outcomes, so we do not know the impact on those. In contrast, providing instruction in encoding (b = 0.18 [0.01, 0.35], p = .045) or writing (b = 0.18 [0.02, 0.34], p = .028) yielded significantly higher effect sizes when these were included as components of the intervention.
We were also interested in whether a focus in a particular area (that is, more in-depth and explicit instruction) significantly moderated impacts. Effect sizes were not significantly associated with providing a more in-depth instructional focus in decoding (b = −0.17 [−0.89, 0.55], p = .216, k = 33), fluency (b = −0.04 [−0.34, 0.26], p = .694, k = 33), or comprehension (b = −0.06 [−0.30, 0.17], p = .572, k = 33). Note that vocabulary was not examined here as a focus due to the limited number of studies.
Publication Bias
Finally, to determine whether the findings suffered from publication/small-study
bias, we implemented the trim-and-fill method (Duval & Tweedie, 2000). The
results indicated that 23 effect sizes were estimated to be missing from the current meta-analysis of 128 total effects. Including these in the random-effects model would minimally decrease the mean effect size from g = 0.39 to g = 0.32 (p < .001, 95% CI [0.27, 0.36]).
Table 4. Moderator analysis.
Columns: Moderator; Coeff; SE; 95% CI; p; df; Q; I²; τ²; n; k; ρ.
Grade level: 1 vs. 2/3 −0.06 0.09 (−0.26, 0.15) 0.542 12 54.82 41.63 0.02 128 33 .8
Grouping: Small group vs. 1:1 −0.13 0.07 (−0.27, 0.01) 0.075 20 64.31 50.24 0.02 128 33 .8
Research design: RCT vs. QED −0.06 0.23 (−1.03, 0.92) 0.823 <4 66.80 52.10 0.02 128 33 .8
Interventionist: Paraprofessional vs. Other −0.12 0.06 (−0.25, 0.02) 0.086 19 49.68 35.59 0.01 128 33 .8
Interventionist: Certified teacher vs. Other 0.14 0.06 (0.00, 0.28) 0.053 8 50.87 37.09 0.02 128 33 .8
Interventionist: Researcher vs. Other −0.01 0.09 (−0.21, 0.20) 0.939 10 67.00 52.24 0.02 128 33 .8
Scripted intervention: Scripted vs. Non-scripted −0.13 0.07 (−0.28, 0.02) 0.094 18 50.53 40.63 0.02 122 31 .8
Nature of the comparison group: SI vs. CR (a) −0.11 0.07 (−0.25, 0.04) 0.139 24 65.88 51.43 0.02 128 33 .8
At-risk sample: At risk vs. Minimal risk 0.13 0.12 (−0.18, 0.44) 0.330 5 16.24 n/a 0.01 58 16 .8
Hours of treatment: Low vs. Other −0.08 0.10 (−0.30, 0.15) 0.464 10 53.19 39.84 0.02 128 33 .8
Hours of treatment: Medium vs. Other 0.04 0.08 (−0.14, 0.23) 0.610 10 66.89 52.16 0.02 128 33 .8
Hours of treatment: High vs. Other 0.03 0.07 (−0.13, 0.18) 0.722 21 57.57 44.42 0.02 128 33 .8
Provided monitoring/feedback: Yes vs. No −0.12 0.07 (−0.27, 0.03) 0.103 12 54.27 41.04 0.02 128 33 .8
Measure type: Standardized vs. Researcher 0.11 0.07 (−0.07, 0.28) 0.189 7 66.36 51.78 0.02 128 33 .8
Outcome domain: Word/pseudoword reading vs. Other 0.10 0.05 (0.00, 0.19) 0.049 19 66.10 51.59 0.02 128 33 .8
Outcome domain: Passage reading fluency vs. Other −0.09 0.06 (−0.23, 0.05) 0.188 10 60.01 46.68 0.02 128 33 .8
Outcome domain: Reading comprehension vs. Other −0.05 0.05 (−0.23, 0.05) 0.308 13 66.58 51.94 0.02 128 33 .8
Instructional area (single model; n = 128, k = 33, ρ = .8): Constant 0.49 0.15 (−0.02, 1.00) 0.055 <4 44.07 27.39 0.02
Instructional area: Decoding −0.12 0.11 (−0.53, 0.29) 0.356 <4
Instructional area: Passage fluency 0.00 0.10 (−0.27, 0.26) 0.974 5
Instructional area: Reading comprehension −0.08 0.07 (−0.23, 0.06) 0.234 13
Instructional area: Vocabulary 0.08 0.09 (−0.12, 0.27) 0.414 11
Instructional area: Phonological awareness −0.19 0.06 (−0.32, −0.05) 0.010 13
Instructional area: Encoding 0.18 0.07 (0.01, 0.35) 0.045 8
Instructional area: Writing 0.18 0.07 (0.02, 0.34) 0.028 8
Note. Coeff = coefficient; SE = standard error; CI = confidence interval; p = significance; df = degrees of freedom; Q = test of homogeneity of effect sizes; I² = measure of effect size variability; τ² = between-study variance; n = number of effect sizes; k = number of studies; ρ = corrected correlation. In all RVE models, we used a ρ value of .80 to estimate the between-study variance; n/a = could not be estimated. Statistically significant estimates (p < .05) appear for word/pseudoword reading vs. other, phonological awareness, encoding, and writing. (a) CR = core reading instruction; SI = school-provided intervention.
Discussion
Results from this meta-analysis of 33 studies of reading interventions conducted
between 2002 and 2017 reveal significant, positive effects on a range of reading out-
comes. The significant mean effect size (Hedges' g) across 33 studies was 0.39 (p < .001),
indicating that students from Grades 1, 2, and 3 who score in the at-risk category on a
screening battery or on a normed test do, on average, benefit from the set of reading
interventions studied. This leads us to conclude that the research base underlying read-
ing interventions is sound and not the primary reason for the lack of impacts in the
national RtI evaluation (Balu et al., 2015), which found null, or in one case negative,
impacts on reading outcomes for students at or near the cut point on screening.
Mean effect sizes (Hedges' g) for each outcome domain ranged from 0.41 in the area of word or pseudoword reading to 0.32 in comprehension and 0.31 in passage reading fluency. All were statistically significant at p < .001. Note that the mean effect size was
the highest in the outcome domain of word and pseudoword reading. This is unsurpris-
ing, given the large body of evidence supporting the use of various forms of systematic,
explicit, small-group instruction in phonemic awareness, phonics instruction, and sight
word reading to help students who are likely to fall behind when experiencing more
traditional instruction (e.g., Gersten, Compton, et al., 2009; National Institute of Child
Health and Human Development [NICHD], 2000).
The reading interventions examined showed many commonalities. Every intervention
addressed multiple aspects of foundational reading: phonological awareness, decoding, passage reading fluency, encoding (spelling), and, on occasion, writing. Nearly all inter-
ventions addressed comprehension in some fashion, although few provided much in the
way of detail. Vocabulary and comprehension instruction were rarely emphasized.
Virtually all interventions included systematic, explicit instruction. Typically, this occurred during instruction in phonics/word-reading skills and passage reading fluency, often with
some activities geared toward fluency building and phonological awareness.
Interventions that included instruction on phonological awareness were associated
with significantly smaller effects, whereas interventions that addressed encoding or writ-
ing yielded significantly higher effect sizes. Perhaps focusing on pre-reading skills
such as phonological awareness after students have started to learn to decode is counterproductive, as it takes time and focus away from gaining proficiency in decoding
skills. We speculate that an encoding component may help reinforce phonics rules and
decoding, and we note that this has been a feature of some core reading programs and
intervention programs.
Variables for Future Exploration
The percentage of between-study heterogeneity not due to chance was 50.75%, suggest-
ing both a good deal of variance in the pattern of effects and the need to use moderator
analyses to begin to understand salient factors. Although many of the moderators
explored in the current meta-analysis were non-significant (p>.05), future research is
needed to explore aspects of the interventions that may moderate the relationship
between the intervention and reading outcomes. In particular, researchers should
continue to investigate variables that could provide us with possible explanations for the
null and negative impacts from the Balu et al. (2015) study.
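For reference, the percentage of heterogeneity not attributable to chance is conventionally reported as the I² statistic. Assuming the standard formulation (the article does not spell out the estimator), it can be written as

$$ I^{2} = \max\!\left(0,\ \frac{Q - df}{Q}\right) \times 100\%, $$

where Q is the homogeneity statistic and df is its degrees of freedom. Values near 50% are usually read as moderate heterogeneity, consistent with the authors' decision to pursue moderator analyses.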
One finding worth exploring further is whether the type of interventionist moderates impacts on measures of reading achievement. Interventions implemented by certified teachers did not yield significantly higher effect sizes (p = .053) than those conducted by others (primarily paraeducators or university students working for a researcher). Yet, results of the Balu et al. (2015) survey revealed that teachers provided intervention in over a third of schools implementing RtI in Grades 1-3. Similarly, the effect sizes for interventions delivered by paraprofessionals did not differ significantly from those delivered by certified teachers or researchers (p = .086) in our analyses. These results conflict
with those reached by Slavin et al. (2011) in an earlier review of literature on reading
interventions, which suggests this is an area that warrants further investigation.
The results from the Balu et al. (2015) study also suggest that schools implementing
RtI often used small groups ranging from 2 to 10 students, as opposed to the interven-
tions in the meta-analysis, which were implemented in smaller groups (2 to 5 students)
or one-on-one. We found an average effect size of 0.46 for interventions that were deliv-
ered one-on-one and 0.31 for those delivered to small groups of students; however, this
moderator variable was not statistically significant (p = .075). Further analyses revealed that grouping moderated effects for Grade 1 but not for Grades 2 and 3 (p = .042). One-on-one instruction may be more beneficial for beginning readers. Similarly, even in small groups of 2-5 students, it may be easier to meet students' needs if all the students in the group are similar in their basic knowledge of rhyming, alphabet, phonemes, and decoding skills. A recent study by Al Otaiba, Connor, et al. (2014) supports this notion. They found that it was necessary to make small groups more homogeneous by adjusting both the text's readability level and the lesson pacing to meet students' individual needs.
It could also be that the schools in the Balu et al. (2015) study used more scripted
interventions, though we cannot know for sure since the Balu survey did not ask
whether scripted interventions were used. Previous research has indicated that programs
where teachers are given some autonomy tend to produce stronger outcomes in reading
comprehension (Fang, Fu, & Lamme, 2004; Tivnan & Hemphill, 2005; Wilson, Martens,
& Arya, 2005). One reason for this might be that scripted interventions leave little room
for even slight adaptations to meet individual student needs when compared to those
with a lesson plan and no exact wording. Our results, however, showed that effect sizes for research interventions that allowed teachers to adapt the intervention to students' needs did not differ significantly from those for scripted interventions (p = .094). Future research
should explore this area.
To avoid overgeneralizing from these findings, it is important to note that other variables may be confounding the relationship. For example, all but 3 of the 15 scripted
interventions were implemented by paraprofessionals. Typically, paraprofessionals
implement scripted programs because most do not have the training to make appro-
priate instructional decisions when using a traditional lesson plan. Because our lim-
ited number of studies hindered our ability to model all the potential moderators at
once (i.e., controlling for other variables), the moderator findings should be inter-
preted with caution.
Relation to Previous Relevant Meta-Analyses
It is difficult to draw a direct comparison between the current study and the Slavin et al. (2011) and Wanzek et al. (2016, 2018) meta-analyses. Though the studies included in this meta-analysis overlap with some of the studies included in the other meta-analyses,
this meta-analysis is the first to use rigorous standards of evidence in the inclusion cri-
teria. The other meta-analyses included studies that were not as rigorous as those in the
current study and included kindergarten interventions, which typically focus heavily on
reading-related skills such as phonological awareness, rhyming, and basic decoding.
The impacts in the current meta-analysis (0.39) are smaller than several impacts in
the Wanzek et al. (2016) meta-analysis of studies of shorter interventions: 0.54 on stand-
ardized foundational skill measures, 0.62 for non-standardized foundational skill meas-
ures, and 1.02 for non-standardized multicomponent measures. The differences in the
magnitude of effect sizes may be due to studies that were not as rigorous as those in the
current study or to the inclusion of kindergarten interventions. However, in the Wanzek
et al. (2016) study, domain-level impacts were reported for composite domains: foundational reading/reading-related skills (including phonological awareness, rhyming, letter identification, as well as measures of decoding; 0.54 to 0.62) and multicomponent measures (a composite of listening and reading comprehension; 0.36 to 1.02), which
makes it difficult to compare against our domain-level impacts, which ranged from 0.31
to 0.41.
Yet the effects in the Wanzek et al. (2018) meta-analysis of studies of longer interventions, the impacts of one-on-one and small-group interventions in Slavin et al. (2011), and the impacts on standardized multicomponent measures in Wanzek et al. (2016) are all similar to those found in our analysis. These findings suggest consistency in
the impact of reading interventions for struggling readers.
Challenges and Limitations in Conducting the Meta-Analysis
Issues in Using Rigorous Design Standards and Contemporary Meta-Analytic Techniques
A unique feature of this meta-analysis is that it included only those studies that met
what is often called the gold standard: What Works Clearinghouse (WWC 3.0) standards for RCTs and quasi-experimental designs. Ninety-one percent of the studies included in this meta-analysis were RCTs (k = 30), which is a much higher proportion of RCTs
than in similar previous meta-analyses (e.g., Wanzek et al., 2016 [55% RCTs]; Swanson,
1999 [47.9% RCTs]). Including only those studies that met these rigorous standards
allows for more confidence in the meta-analytic findings. This is an especially important
contemporary issue given the lack of replicability of findings in the social sciences
(Ioannidis, 2005) and the general concern about false positives in both individual
research studies (Benjamini & Hochberg, 1995) and meta-analyses (Greco, Zangrillo,
Biondi-Zoccai, & Landoni, 2013).
The gain in trustworthiness, however, came at the cost of statistical power, because fewer studies met the rigorous design standards; this particularly affected the crucial moderator analyses, which help in understanding possible underlying themes in the data. This is likely to become an issue in future meta-analyses, as the
tradeoff between the quality and validity of the research findings conflicts with the need
for a large number of studies in conducting important moderator analyses with suffi-
cient power. We suspect it will take some time for the field to produce enough high-
quality studies to result in statistically significant findings from which we could draw
conclusions across studies.
As studies most often contain more than one outcome measure and at times more
than one comparison, meta-analyses must address the dependencies arising from such
multiple outcomes and comparisons within studies. This issue was pertinent for our
meta-analysis, as 90% of the studies included multiple measures and 15% contained
multiple comparisons. Thus, to account for the dependencies in the data, we used Robust Variance Estimation (RVE; Hedges et al., 2010), a contemporary statistical technique. One problem with RVE is that it yields low statistical power unless the meta-analysis includes a large number of studies (López-López, Van den Noortgate, Tanner-Smith, Wilson, & Lipsey, 2017). Tipton (2015) notes that at least 40
studies are needed for adequate statistical power to conduct the moderator analyses.
Our meta-analysis included 33 high-quality experimental and quasi-experimental studies
(meeting WWC standards), a number not typically seen for a topic as specific as this in
other areas of educational research. However, it still fell below the minimum number of
40 studies for adequate statistical power to conduct the moderator analyses.
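To illustrate the mechanics of the correlated-effects RVE estimator described by Hedges et al. (2010), the sketch below computes a robust mean effect size and its robust standard error. This is not the authors' analysis code (the reference list points to Stata's robumeta module); the between-study variance τ² is supplied directly rather than estimated (in practice it comes from a method-of-moments formula in which the assumed correlation ρ, set to .80 in this article, appears), the data are fabricated, and Tipton's (2015) small-sample adjustments are omitted.

```python
import numpy as np

def rve_mean_effect(effects_by_study, variances_by_study, tau2=0.02):
    """Correlated-effects RVE estimate of a mean effect size (intercept-only model)."""
    num = 0.0            # running sum of w_j * T_ij over all effect sizes
    den = 0.0            # running sum of w_j over all effect sizes
    study_weights = []
    for T, v in zip(effects_by_study, variances_by_study):
        T = np.asarray(T, dtype=float)
        v = np.asarray(v, dtype=float)
        k_j = len(T)
        # One weight per study: inverse of k_j times (mean within-study variance + tau^2)
        w_j = 1.0 / (k_j * (v.mean() + tau2))
        num += w_j * T.sum()
        den += w_j * k_j
        study_weights.append(w_j)
    b = num / den        # weighted mean effect size

    # Robust (sandwich) variance: studies, not individual effect sizes, are the
    # independent units, so residuals are summed within each study before squaring.
    meat = sum((w_j * float(np.sum(np.asarray(T, dtype=float) - b))) ** 2
               for w_j, T in zip(study_weights, effects_by_study))
    se = np.sqrt(meat) / den
    return b, se

# Fabricated toy data: three studies contributing 2, 1, and 3 effect sizes.
effects = [[0.45, 0.30], [0.55], [0.20, 0.35, 0.40]]
variances = [[0.02, 0.03], [0.04], [0.05, 0.05, 0.06]]
b, se = rve_mean_effect(effects, variances)
print(f"robust mean effect = {b:.2f}, robust SE = {se:.2f}")
```

Because studies serve as the independent units in the sandwich variance, the robust standard error is itself imprecise when few studies are available, which is the source of the 40-study guideline noted above.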
A meta-regression model with all moderators entered simultaneously (e.g., Gersten,
Chard, et al., 2009; Wanzek et al., 2016) would have been preferable to our series of
analyses, which tested each moderator individually. However, the overall number of
studies that met the inclusion and study quality criteria was small, and not every study
included information that permitted coding of all moderators. Thus, analyzing all varia-
bles within one regression model was not feasible, as results would be
uninterpretable due to insufficient degrees of freedom. Therefore, the single-predictor
RVE meta-regression models used in the meta-analysis must be interpreted with caution
due to the potential confounding effects of other moderators that are not accounted for
in these models.
Issues in Coding Studies
Only a few studies (Denton et al., 2014; Vadasy, Sanders, & Tudor, 2007) provided a
rich description of the nature of instruction in the intervention. Most articles did not provide sufficient detail on how reading was taught; instead, they merely listed the areas
of instruction that were covered, provided a brief cursory explanation, or addressed
them in a figure or table with little sense of the amount of time devoted to activities or
what they actually entailed. Because written descriptions were often not detailed enough,
coding or classifying these areas was at times guesswork.
Given the difficulty we had with coding the instructional focus categories, we recommend that future intervention research articles include more detailed descriptions of each component of the intervention. However, this may be easier said than done, as journal article submission usually entails strict space limits. Understanding this, we would encourage authors to write detailed descriptions with sample lessons, post them on a website noted in the article, and provide access to the information. That would
allow those involved in research syntheses and those interested in replications to
access this material and ultimately have a better understanding of the nature of
the research.
Coding the at-risk category was a challenging task, as out of the 33 studies, only 16
could be used for the analysis examining the moderating role of the at-risk status variable.
This is because, across the 33 studies, there was little commonality in how at-risk status
was operationally defined (i.e., described as below grade-level performance; based on local
norms, national norms, researcher-developed measures, or validated screening measures),
making comparisons across studies difficult and underpowered. We also found the use of
norms from standardized tests to be problematic because some of the norms were much
older than others, and the field of early literacy instruction has undergone massive changes
in the past 15 years. In addition, there are likely to be shifts in national norms on some
measures, especially those involving phonological awareness, phonics, and possibly oral
reading fluency. It would be helpful if the field could adopt more consistent means of
determining the suitable samples for a Tier 2 reading intervention.
Fidelity of implementation was another area that was difficult to code due to the lack
of consistency across studies in how fidelity was explained and measured. For instance,
if different measurement systems are used, 80% fidelity in one study is not comparable
to 80% fidelity in another study. As a result, though this was very much an area of
interest for us, we could not code for fidelity as a moderator.
Implications for Future Research
Most intervention studies examine impacts immediately at the end of an intervention.
An important next step in reading intervention research, one only occasionally
attempted to date (e.g., Al Otaiba, Kim, Wanzek, Petscher, & Wagner, 2014; Blachman
et al., 2014; Vaughn et al., 2008), is to see whether the impacts on reading performance
are maintained, both with and without further intervention, in follow-up studies.
Additional intervention research is also needed in the area of vocabulary. Few studies
in our meta-analysis addressed reading vocabulary in a comprehensive manner during
the intervention, and only two studies (Gunn, Smolkowski, Biglan, Black, & Blair, 2005;
O'Connor, Swanson, & Geraghty, 2010) included vocabulary as an outcome measure.
We were therefore unable to draw conclusions on this crucial aspect of reading profi-
ciency. Future intervention research, especially in Grades 2 and 3, should include a sys-
tematic vocabulary instruction component in the interventions and assess its
effectiveness using reading vocabulary outcomes.
We would also encourage more intervention research in the areas of reading and lan-
guage comprehension, since these were areas of weaker impacts. Newer intervention
research (e.g., Foorman, Herrera, & Dombek, 2018) increasingly includes both reading and listening comprehension, and such interventions may lead to stronger impacts in the reading comprehension domain.
Acknowledgments
The authors wish to acknowledge the sage advice provided by Nancy Lewis and Terri Pigott, and
recognize Samantha Spallone, Pam Foremski, and Christopher Tran for their assistance.
Funding
This research was supported in part by Contract Number [ED-IES-12-C-0011]. The views do not
represent those of the U.S. Department of Education.
References
Al Otaiba, S., Connor, C. M., Folsom, J. S., Wanzek, J., Greulich, L., Schatschneider, C., &
Wagner, R. K. (2014). To wait in Tier 1 or intervene immediately: A randomized experiment
examining first-grade response to intervention in reading. Exceptional Children,81(1), 1127.
doi:10.1177/0014402914532234
Al Otaiba, S., & Fuchs, D. (2002). Characteristics of children who are unresponsive to early liter-
acy intervention: A review of the literature. Remedial and Special Education,23(5), 300316.
doi:10.1177/07419325020230050501
Al Otaiba, S., Kim, Y. S., Wanzek, J., Petscher, Y., & Wagner, R. K. (2014). Long-term effects of
first-grade multitier intervention. Journal of Research on Educational Effectiveness,7(3),
250267. doi:10.1080/19345747.2014.906692
Allor, J., & McCathren, R. (2004). The efficacy of an early literacy tutoring program implemented
by college students. Learning Disabilities Research and Practice,19(2), 116129. doi:10.1111/j.
1540-5826.2004.00095.x
Balu, R., Zhu, P., Doolittle, F., Schiller, E., Jenkins, J., & Gersten, R. (2015). Evaluation of response
to intervention practices for elementary school reading (NCEE 2016-4000). Washington, DC:
National Center for Education Evaluation and Regional Assistance, Institute of Education
Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/pubs/20164000/
pdf/20164000.pdf
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and power-
ful approach to multiple testing. Journal of the Royal Statistical Society: Series B
(Methodological),57(1), 289300. doi:10.1111/j.2517-6161.1995.tb02031.x
Berninger, V. W., Abbott, R. D., Vermeulen, K., & Fulton, C. M. (2006). Paths to reading com-
prehension in at-risk second-grade readers. Journal of Learning Disabilities,39(4), 334351.
doi:10.1177/00222194060390040701
Blachman, B. A., Schatschneider, C., Fletcher, J. M., Francis, D. J., Clonan, S. M., Shaywitz, B. A.,
& Shaywitz, S. E. (2004). Effects of intensive reading remediation for second and third graders
and a 1-year follow-up. Journal of Educational Psychology,96 (3), 444461. doi:10.1037/0022-
0663.96.3.444
Blachman, B. A., Schatschneider, C., Fletcher, J. M., Murray, M. S., Munger, K. A., & Vaughn,
M. G. (2014). Intensive reading remediation in grade 2 or 3: Are there effects a decade later?
Journal of Educational Psychology,106 (1), 4657. doi:10.1037/a0033663
Case, L. P., Speece, D. L., Silverman, R., Ritchey, K. D., Schatschneider, C., Cooper, D. H.,
Jacobs, D. (2010). Validation of a supplemental reading intervention for first-grade children.
Journal of Learning Disabilities,43(5), 402417. doi:10.1177/0022219409355475
Case, L., Speece, D., Silverman, R., Schatschneider, C., Montanaro, E., & Ritchey, K. (2014).
Immediate and long-term effects of tier 2 reading instruction for first-grade students with a
high probability of reading failure. Journal of Research on Educational Effectiveness,7(1),
2853. doi:10.1080/19345747.2013.786771
Center for Research and Reform in Education & Johns Hopkins University. (2019). Evidence for
ESSA: Standards and procedures. Retrieved from https://content.evidenceforessa.org/sites/
default/files/On%20clean%20Word%20doc.pdf
Denton, C. A., Fletcher, J. M., Taylor, W. P., Barth, A. E., & Vaughn, S. (2014). An experimental
evaluation of guided reading and explicit interventions for primary-grade students at-risk for
reading difficulties. Journal of Research on Educational Effectiveness,7(3), 268293. doi:10.1080/
19345747.2014.906010
Denton, C. A., Nimon, K., Mathes, P. G., Swanson, E. A., Kethley, C., Kurz, T. B., & Shih, M.
(2010). Effectiveness of a supplemental early reading intervention scaled up in multiple schools.
Exceptional Children,76 (4), 394416. doi:10.1177/001440291007600402
Denton, C. A., Tolar, T. D., Fletcher, J. M., Barth, A. E., Vaughn, S., & Francis, D. J. (2013).
Effects of tier 3 intervention for students with persistent reading difficulties and characteristics
of inadequate responders. Journal of Educational Psychology,105(3), 633648. doi:10.1037/
a0032581
Duval, S., & Tweedie, R. (2000). Trim and fill: A simple funnel-plot-based method of testing and
adjusting for publication bias in meta-analysis. Biometrics,56 (2), 455463. doi:10.1111/j.0006-
341X.2000.00455.x
Elleman, A. M., Lindo, E. J., Morphy, P., & Compton, D. L. (2009). The impact of vocabulary
instruction on passage-level comprehension of school-age children: A meta-analysis. Journal of
Research on Educational Effectiveness,2(1), 144. doi:10.1080/19345740802539200
Every Student Succeeds Act of 2015, Pub. L. No. 114-95, § 8101(21)(A), 129 Stat. 1939 (2015).
Fang, Z., Fu, D., & Lamme, L. L. (2004). From scripted instruction to teacher empowerment:
Supporting literacy teachers to make pedagogical transitions. Literacy (Formerly Reading),
38(1), 5864. doi:10.1111/j.0034-0472.2004.03801010.x
Fien, H., Smith, J. L. M., Smolkowski, K., Baker, S. K., Nelson, N. J., & Chaparro, E. (2015). An
examination of the efficacy of a multitiered intervention on early reading outcomes for first
grade students at risk for reading difficulties. Journal of Learning Disabilities,48(6), 602621.
doi:10.1177/0022219414521664
Foorman, B., Beyler, N., Borradaile, K., Coyne, M., Denton, C. A., Dimino, J., Wissel, S.
(2016). Foundational skills to support reading for understanding in kindergarten through 3rd
grade (NCEE 2016-4008). Washington, DC: National Center for Education Evaluation and
Regional Assistance (NCEE), Institute of Education Sciences, U.S. Department of Education.
Retrieved from https://ies.ed.gov/ncee/wwc/practiceguide/21
Foorman, B. R., Herrera, S., & Dombek, J. (2018). The relative impact of aligning Tier 2 interven-
tion materials with classroom core reading materials in grades K-2. The Elementary School
Journal,118(3), 477504. doi:10.1086/696021
Francis, D. J., Kulesz, P. A., & Benoit, J. S. (2018). Extending the simple view of reading to
account for variation within readers and across texts: The complete view of reading (CVR i).
Remedial and Special Education,39(5), 274288. doi:10.1177/0741932518772904
Fuchs, D., Compton, D. L., Fuchs, L. S., Bryant, J., & Davis, G. N. (2008). Making "secondary intervention" work in a three-tier responsiveness-to-intervention model: Findings from the
first-grade longitudinal reading study of the National Research Center on Learning Disabilities.
Reading and Writing,21(4), 413436. doi:10.1007/s11145-007-9083-9
Fuchs, D., & Fuchs, L. S. (2017). Critique of the national evaluation of response to intervention:
A case for simpler frameworks. Exceptional Children,83(3), 255268. doi:10.1177/
0014402917693580
Fuchs, L. S., Fuchs, D., Powell, S. R., Seethaler, P. M., Cirino, P. T., & Fletcher, J. M. (2008).
Intensive intervention for students with math disabilities: Seven principles of effective practice.
Learning Disability Quarterly,31(2), 7992. doi:10.2307/20528819
Gamse, B. C., Jacob, R. T., Horst, M., Boulay, B., & Unlu, F. (2008). Reading first impact study
final report (NCEE 2009-4038). Washington, DC: National Center for Education Evaluation
and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Gersten, R., Chard, D., Jayanthi, M., Baker, S., Morphy, P., & Flojo, J. (2009). Mathematics
instruction for students with learning disabilities: A meta-analysis of instructional components.
Review of Educational Research,79(3), 12021242. doi:10.3102/0034654309334431
Gersten, R., Compton, D., Connor, C. M., Dimino, J., Santoro, L., Linan-Thompson, S., & Tilly,
W. D. (2009). Assisting students struggling with reading: Response to Intervention and multi-tier
intervention for reading in the primary grades. A practice guide (NCEE 2009-4045).
Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute
of Education Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/
wwc/pdf/practice_guides/rti_reading_pg_021809.pdf
Gersten, R., Jayanthi, M., & Dimino, J. (2017). Too much, too soon? A commentary on what the
national RtI evaluation left unanswered and what reading intervention research tells us.
Exceptional Children,83(3), 244254. doi:10.1177/0014402917692847
Greco, T., Zangrillo, A., Biondi-Zoccai, G., & Landoni, G. (2013). Meta-analysis: Pitfalls and
hints. Heart, Lung and Vessels,5(4), 219225.
Gunn, B., Smolkowski, K., Biglan, A., Black, C., & Blair, J. (2005). Fostering the development of
reading skill through supplemental instruction results for Hispanic and non-Hispanic students.
The Journal of Special Education,39(2), 6685. doi:10.1177/00224669050390020301
Hedberg, E. C. (2011). ROBUMETA: Stata module to perform robust variance estimation in meta-
regression with dependent effect size estimates. Boston, MA: Boston College.
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related estima-
tors. Journal of Educational Statistics,6(2), 107128. doi:10.3102/10769986006002107
Hedges, L. V., Tipton, E., & Johnson, M. C. (2010). Robust variance estimation in meta-regres-
sion with dependent effect size estimates. Research Synthesis Methods,1(1), 3965. doi:10.1002/
jrsm.5
Imbens, G. W., & Lemieux, T. (2008). Regression discontinuity designs: A guide to practice.
Journal of Econometrics,142(2), 615635. doi:10.1016/j.jeconom.2007.05.001
Individuals with Disabilities Education Act, Pub. L. No. 108-446, 20 U.S.C. § 1400, 118 Stat. 2649 (2004).
Ioannidis, J. (2005). Why most published research findings are false. PLoS Medicine,2(8), e124.
doi:10.1371/journal.pmed.0020124
Jacob, R., Armstrong, C., Bowden, A. B., & Pan, Y. (2016). Leveraging volunteers: An experimen-
tal evaluation of a tutoring program for struggling readers. Journal of Research on Educational
Effectiveness,9(Supp 1), 6792. doi:10.1080/19345747.2016.1138560
Jenkins, J. R., Peyton, J. A., Sanders, E. A., & Vadasy, P. F. (2004). Effects of reading decodable
texts in supplemental first-grade tutoring. Scientific Studies of Reading,8(1), 5385. doi:10.
1207/s1532799xssr0801_4
Lane, H. B., Pullen, P. C., Hudson, R. F., & Konold, T. R. (2009). Identifying essential instruc-
tional components of literacy tutoring for struggling beginning readers. Literacy Research and
Instruction,48(4), 277297. doi:10.1080/19388070902875173
López-López, J. A., Van den Noortgate, W., Tanner-Smith, E. E., Wilson, S. J., & Lipsey, M. W.
(2017). Assessing meta-regression methods for examining moderator relationships with
dependent effect sizes: A Monte Carlo simulation. Research Synthesis Methods,8(4), 435450.
doi:10.1002/jrsm.1245
May, H., Sirinides, P., Gray, A., & Goldsworthy, H. (2016). Reading recovery: An evaluation of the
four-year i3 scale-up. Philadelphia, PA: Consortium for Policy Research in Education,
University of Pennsylvania.
National Institute of Child Health and Human Development [NICHD]. (2000). Report of the
National Reading Panel. Teaching children to read: Reports of the subgroups (NIH Publication
No. 00-4754). Washington, DC: U.S. Department of Health and Human Services. Retrieved
from https://www.nichd.nih.gov/sites/default/files/publications/pubs/nrp/Documents/report.pdf
No Child Left Behind Act of 2001 [NCLB], Pub. L. No. 107-110, § 1201, 115 Stat. 1425 (2002).
O'Connor, R. E., Swanson, H. L., & Geraghty, C. (2010). Improvement in reading rate under
independent and difficult text levels: Influences on word and comprehension skills. Journal of
Educational Psychology,102(1), 119. doi:10.1037/a0017488
Pullen, P. C., Lane, H. B., & Monaghan, M. C. (2004). Effects of a volunteer tutoring model on
the early literacy development of struggling first grade students. Reading Research and
Instruction,43(4), 2140. doi:10.1080/19388070409558415
Scanlon, D. M., Vellutino, F. R., Small, S. G., Fanuele, D. P., & Sweeney, J. M. (2005). Severe
reading difficulties: Can they be prevented? A comparison of prevention and intervention
approaches. Exceptionality,13(4), 209227. doi:10.1207/s15327035ex1304_3
Schwartz, R. M. (2005). Literacy learning of at-risk first-grade students in the reading recovery
early intervention. Journal of Educational Psychology,97(2), 257267. doi:10.1037/0022-0663.97.
2.257
Slavin, R. E., Lake, C., Davis, S., & Madden, N. A. (2011). Effective programs for struggling read-
ers: A best-evidence synthesis. Educational Research Review,6(1), 126. doi:10.1016/j.edurev.
2010.07.002
Smith, J. L. M., Nelson, N. J., Fien, H., Smolkowski, K., Kosty, D., & Baker, S. K. (2016).
Examining the efficacy of a multitiered intervention for at-risk readers in grade 1. The
Elementary School Journal,116 (4), 549573. doi:10.1086/686249
StataCorp. (2015). Stata statistical software (Release 14). College Station, TX: StataCorp LP.
Stuebing, K. K., Barth, A. E., Trahan, L. H., Reddy, R. R., Miciak, J., & Fletcher, J. M. (2015). Are
child cognitive characteristics strong predictors of responses to intervention? A meta-analysis.
Review of Educational Research,85(3), 395429. doi:10.3102/0034654314555996
Swanson, H. L. (1999). Reading research for students with LD: A meta-analysis of intervention
outcomes. Journal of Learning Disabilities,32(6), 504532. doi:10.1177/002221949903200605
Tanner-Smith, E. E., & Tipton, E. (2014). Robust variance estimation with dependent effect sizes:
Practical considerations including a software tutorial in Stata and SPSS. Research Synthesis
Methods,5(1), 1330. doi:10.1002/jrsm.1091
Tipton, E. (2015). Small sample adjustments for robust variance estimation with meta-regression.
Psychological Methods,20(3), 375393. doi:10.1037/met0000011
Tivnan, T., & Hemphill, L. (2005). Comparing four literacy reform models in high-poverty
schools: Patterns of first-grade achievement. The Elementary School Journal,105(5), 419441.
doi:10.1086/431885
Tran, L., Sanchez, T., Arellano, B., & Swanson, H. L. (2011). A meta-analysis of the RTI literature
for children at risk for reading disabilities. Journal of Learning Disabilities,44(3), 283295. doi:
10.1177/0022219410378447
U.S. Department of Education [U.S. ED], Institute of Education Sciences [IES], & What Works
Clearinghouse [WWC]. (2013). What Works Clearinghouse: Procedures and standards handbook
(Version 3.0). Retrieved from https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_proce-
dures_v3_0_standards_handbook.pdf
Vadasy, P. F., & Sanders, E. A. (2011). Efficacy of supplemental phonics-based instruction for
low-skilled first graders: How language minority status and pretest characteristics moderate
treatment response. Scientific Studies of Reading,15(6), 471497. doi:10.1080/10888438.2010.
501091
Vadasy, P. F., Sanders, E. A., & Peyton, J. A. (2006). Paraeducator-supplemented instruction in
structural analysis with text reading practice for second and third graders at risk for reading
problems. Remedial and Special Education,27(6), 365378. doi:10.1177/07419325060270060601
Vadasy, P. F., Sanders, E. A., & Tudor, S. (2007). Effectiveness of paraeducator-supplemented
individual instruction: Beyond basic decoding skills. Journal of Learning Disabilities,40(6),
508525. doi:10.1177/00222194070400060301
Vaughn, S., Cirino, P. T., Tolar, T., Fletcher, J. M., Cardenas-Hagan, E., Carlson, C. D., &
Francis, D. J. (2008). Long-term follow-up of Spanish and English interventions for first-grade
English language learners at risk for reading problems. Journal of Research on Educational
Effectiveness,1(3), 179214. doi:10.1080/19345740802114749
Vellutino, F. R., & Scanlon, D. M. (2002). The Interactive Strategies approach to reading interven-
tion. Contemporary Educational Psychology,27(4), 573635. doi:10.1016/S0361-476X(02)00002-4
Wang, C., & Algozzine, B. (2008). Effects of targeted intervention on early literacy skills of at-risk
students. Journal of Research in Childhood Education,22(4), 425439. doi:10.1080/
02568540809594637
Wanzek, J., Stevens, E. A., Williams, K. J., Scammacca, N., Vaughn, S., & Sargent, K. (2018).
Current evidence on the effects of intensive early reading interventions. Journal of Learning
Disabilities,51(6), 612624. doi:10.1177/0022219418775110
Wanzek, J., & Vaughn, S. (2007). Research-based implications from extensive early reading inter-
ventions. School Psychology Review,36 (4), 541561.
Wanzek, J., & Vaughn, S. (2008). Response to varying amounts of time in reading intervention
for students with low response to intervention. Journal of Learning Disabilities,41(2), 126142.
doi:10.1177/0022219407313426
Wanzek, J., Vaughn, S., Scammacca, N., Gatlin, B., Walker, M. A., & Capin, P. (2016). Meta-anal-
yses of the effects of tier 2 type reading interventions in grades K-3. Educational Psychology
Review,28(3), 551576. doi:10.1007/s10648-015-9321-7
Wilson, P., Martens, P., & Arya, P. (2005). Accountability for reading and readers: What the
numbers don't tell. The Reading Teacher, 58(7), 622-631. doi:10.1598/RT.58.7.3