THEORY, CONTEXT, AND MECHANISMS
Meta-Analysis of the Impact of Reading Interventions for
Students in the Primary Grades
Russell Gersten, Kelly Haymond, Rebecca Newman-Gonchar, Joseph Dimino, and Madhavi Jayanthi
Instructional Research Group, Los Alamitos, California, USA

ABSTRACT
This meta-analysis systematically reviewed the most up-to-date literature to determine the effectiveness of reading interventions on measures of word and pseudoword reading, reading comprehension, and passage fluency, and to determine the role intervention and study variables play in moderating the impacts for students at risk for reading difficulties in Grades 1–3. We used random-effects meta-regression models with robust variance estimates to summarize overall effects and to explore potential moderator effects. Results from a total of 33 rigorous experimental and quasi-experimental studies conducted between 2002 and 2017 that met WWC evidence standards revealed a significant positive effect for reading interventions on reading outcomes, with a mean effect size of 0.39 (SE = .04, p < .001, 95% CI [0.32, 0.46]). Moderator analyses demonstrated that mean effects varied across outcome domains and areas of instruction.

ARTICLE HISTORY
Received 29 January 2019; Revised 22 October 2019; Accepted 31 October 2019

KEYWORDS
Reading; response to intervention; multi-tiered system of support; Tier 2 intervention; meta-analysis
Multi-tiered systems of support (MTSS), also referred to as Response to Intervention
(RtI), have become routine in American elementary schools, especially in the area of lit-
eracy/reading. In 2010–2011, for example, full implementation of MTSS in Grade 1
reading occurred in 71 percent of schools from a demographically representative sample
(Balu et al., 2015). The massive scale-up of MTSS was fueled by major pieces of federal
legislation, such as the Reading First portion of the No Child Left Behind Act (NCLB,
2002), the Individuals with Disabilities Education Act (IDEA, 2004), and the Every
Student Succeeds Act (ESSA, 2015). ESSA explicitly called for an emphasis on evidence-
based interventions with "strong" and "moderate" levels of evidence, based on the What
Works Clearinghouse (WWC) standards (U.S. Department of Education [U.S. ED],
Institute of Education Sciences [IES], & What Works Clearinghouse, 2013).
With the rapid and widespread implementation of MTSS in reading across schools, a
national study was undertaken to examine the impact of high-quality MTSS in reading
(identified as high-quality by experts and onsite evaluation teams) on the performance
of students in primary grades (Balu et al., 2015). To the surprise of many in the reading
research community, the evaluation found statistically significant negative effects on
Grade 1 reading performance and non-significant impacts on Grades 2 and 3 reading
performance in 146 elementary schools in 13 states using a regression discontinuity
design (Imbens & Lemieux, 2008).
Some in the reading research community raised concerns about the study design, the
use of a regression design to answer questions about effectiveness, how impacts were com-
bined across districts that use different curricula, and the limited monitoring of fidelity of
implementation (e.g., Fuchs & Fuchs, 2017; Gersten, Jayanthi, & Dimino, 2017). Yet, the
findings are hard to ignore. They raise questions about the effectiveness of interventions in
authentic settings: specifically, how could interventions based on principles from scientific research, or those actually shown to be evidence-based according to current standards, be ineffectual or even slightly negative in practice?
There are at least three plausible reasons for the finding. The first is that the evidence base on the effectiveness of beginning reading interventions may not be as robust or
consistent as previously believed. The second is that there is a body of rigorously con-
ducted research documenting the effectiveness of beginning reading interventions and
the instructional practices used in the research, but these interventions were not imple-
mented with fidelity in practice. The third could be a result of the types of outcome
measures used. Measure types have been found to moderate the size of the impacts
(Elleman, Lindo, Morphy, & Compton, 2009). Students in the national evaluation were
assessed on comprehensive measures of reading performance, as opposed to measures that assess discrete reading proficiencies, which are often well aligned with the intervention's focus.
Rather than listing the names of interventions with rigorous research support, as
done for example on the WWC website, we thought it more important to conduct a
careful examination and rigorous meta-analysis of the body of contemporary research
on reading interventions relevant to MTSS. Doing so would allow us to articulate the
instructional practices and principles that underlie the interventions found to be effect-
ive, and to delineate the outcomes and the grade levels for which there is strong
evidence to support intervention. It also seemed important to examine factors such as
the type of interventionist and the level of support and monitoring provided to the
interventionist.
First, we briefly review prior meta-analyses and literature reviews on this topic.
A Brief Summary of Meta-Analyses and Literature Reviews on Reading
Interventions
Six syntheses of research (either literature reviews or meta-analyses) address the topic of
reading interventions in the primary grades, albeit from different perspectives. Three of
these do not examine the impacts of reading interventions: Al Otaiba and Fuchs (2002),
Stuebing et al. (2015), and Tran, Sanchez, Arellano, and Swanson (2011) explored the
relationship between students' prior reading abilities and other learner characteristics
and their responsiveness to reading intervention.
Slavin, Lake, Davis, and Madden (2011) examined 70 studies of programs geared
toward providing support to struggling readers in elementary school (K-5), including
not only the small-group interventions (20 studies) and one-on-one interventions (20
studies) typical for Tier 2 of MTSS, but also whole-class instructional practices (16
studies) and computer-based (14 studies) instruction for at-risk readers. The studies in
the review were not evaluated for quality, though the authors did limit their review to
studies that included randomization or matching to form a comparison group, a far
lower standard than the WWC standard used for these studies. Studies were only
included if the program lasted at least 12 weeks. Effects ranged from 0.09 for computer-
based interventions to 0.56 for whole-class interventions. One-on-one and small group
interventions resulted in effects of 0.39 and 0.31, respectively, suggesting that both
approaches show evidence of promise for Tier 2 intervention.
Wanzek and colleagues conducted two meta-analyses of reading intervention studies
involving students in kindergarten through Grade 3. The first (Wanzek et al., 2016) examined studies of shorter interventions (lasting fewer than 100 sessions); the second (Wanzek et al., 2018) included only longer interventions (lasting more than 100 sessions). The meth-
odology was similar, incorporating RCTs and QEDs only, but not ensuring that the study
met rigorous WWC standards. The first set included 72 studies, published between 1995
and 2013. The second, more recent meta-analysis of longer interventions located only 25
studies published from 1995 to 2015, using similar methodology.
For the first meta-analysis of shorter interventions, the authors classified outcomes
into one of two domains developed to correspond with the simple view of reading
(Francis, Kulesz, & Benoit, 2018). The first addressed the broad area of decoding,
including pre-reading skills (phonological awareness, rhyming, letter identification), as
well as measures of decoding of phonetically regular words and pseudowords, and word
reading and fluency. The second domain was called multicomponent and was essentially
a composite of both listening comprehension and reading comprehension, as well as
oral vocabulary and reading vocabulary. Using a random effects model, the authors
found positive mean effects ranging from 0.54 to 0.62 on the composite outcomes of
foundational reading and reading-related skills and 0.36 to 1.02 on language and com-
prehension measures. There was no evidence that group size, intervention type, grade
level, or interventionist were related to the magnitude of impacts.
Findings from the 2018 meta-analysis of longer interventions produced a mean effect
size of 0.28 when corrected for publication bias. The effects were found to be homoge-
neous, precluding the use of moderator analysis or meta-regression.
Rationale for the Present Meta-Analysis of Tier 2 Reading Interventions in
Grades 1–3
We decided to conduct a new meta-analysis for several reasons. One reason is that we
wanted to use robust variance estimation (RVE; Hedges, Tipton, & Johnson, 2010), a
more contemporary approach that addresses dependent effect sizes arising from multiple
outcomes and comparisons within studies. RVE allows researchers to model all depend-
encies statistically, compared to traditional meta-regression approaches, which address
dependencies by selecting specific comparisons, selecting a single measure, or aggregat-
ing all measures by computing an average effect. Of the six related syntheses we found,
only the most recent meta-analysis by Wanzek and colleagues (2018) used RVE but that
study focused on longer reading interventions. The current study used RVE on a set of
studies that have not been previously examined with this type of analysis.
A second reason for conducting a new meta-analysis is to limit the grade level to
those in which students are expected to begin reading. Typically, kindergarten interven-
tion studies include very few, if any, reading measures. Instead, they often include meas-
ures of pre-reading or reading-related skills such as listening comprehension, rhyming,
and phonemic awareness. The three previous reviews (Slavin et al., 2011; Wanzek et al.,
2016, 2018) included kindergarten studies and measures of pre-reading skills
(i.e., phonological awareness, listening comprehension). As our goal was to determine
whether students receiving the intervention progressed beyond the pre-reading stage
and truly learned to read, we only included studies from Grades 1, 2, and 3 to reflect
the grades included in the national RtI evaluation study (Balu et al., 2015). We also lim-
ited the outcomes to include only measures of reading performance (word reading, pas-
sage fluency, reading comprehension), which is in line with the ESSA standards for
evaluating study outcomes in the primary grades (Center for Research and Reform in
Education & Johns Hopkins University, 2019) and the framework used for assessing stu-
dent reading performance in the Reading First national evaluation (Gamse, Jacob,
Horst, Boulay, & Unlu, 2008). (Reading First used reading comprehension and decoding
to assess the reading performance of struggling students in Grades K–2). We did not
include studies of interventions above Grade 3 because the interventions in Grades 4
and 5, for example, are very different from those in Grades 1–3: They focus more on
comprehension, vocabulary development, and fluency building, and less on decoding.
Finally, given the focus of ESSA (2015) on using interventions with "moderate" to "strong" levels of evidence from studies that have met WWC standards (Version 3.0;
U.S. ED et al., 2013) for high-quality causal studies, we wanted to conduct a formal
review of the studies using WWC standards and include only those studies in the meta-
analysis that met those standards. The current set of studies is thus a more focused set
that has been screened for the designs' rigor and the findings' trustworthiness.
Purpose of the Present Meta-Analysis
The purpose of this meta-analysis is to synthesize rigorously conducted randomized
controlled trials and quasi-experimental studies on reading interventions for students
who are at risk for reading difficulty in Grades 1–3. The research questions guiding this
project are:
1. Overall, how effective are reading interventions that are designed to improve the
reading outcomes (i.e., reading of words and pseudowords, passage reading flu-
ency, and reading comprehension) of Grades 1–3 students who are considered at
risk for reading difficulties?
2. Do study characteristics (i.e., nature of comparison, design, grade level, risk sta-
tus, and outcome domain/type) or intervention characteristics (i.e., group size,
interventionist, average hours per week of intervention, whether the intervention
was scripted, areas of instruction within the interventions, and support provided
to interventionists) moderate the effect of reading interventions on read-
ing outcomes?
Method
Literature Search and Selection of Relevant Studies for the Meta-Analysis
The goal of the search was to locate all studies published from January 2002 to March
2017 focused on reading interventions for students in Grades 1–3. The literature search
began with a keyword search of the following databases: Academic Search Premier,
Campbell Collaboration, Educators Reference Complete, ERIC, PsycINFO, Social
Sciences Citation Index, and WorldCat. The following keywords were used: reading, lit-
eracy, fluency, decoding, vocabulary, comprehension, reading ability, reading proficiency,
reading achievement, response to intervention and instruction, reading intervention, RtI,
response to intervention, response to instruction, Tier 2 intervention, tutoring, small-group
instruction, one-on-one instruction, intensive intervention, at-risk students, at-risk, contin-
ued risk, non-responders, responders, reading difficulties, reading disabilities, and strug-
gling readers. In addition, we examined all WWC intervention reports in beginning
reading and two relevant WWC Practice Guides, Assisting Students Struggling with
Reading and Improving Reading Comprehension in Kindergarten Through 3rd Grade. We
performed a version of hand-searching known as snowballing, by checking the reference
lists of research syntheses on the topic. Finally, we solicited recommendations from key
researchers in the field on studies likely to be eligible; this procedure is documented in Gersten et al. (2017). Toward the end of the search and review process, we examined any studies not previously located but included in the foundational reading practice guide (Foorman et al., 2016) and other meta-analyses and research syntheses (e.g., Wanzek & Vaughn, 2007; Wanzek et al., 2016).
The search resulted in the identification of 2,423 publications. All studies were
screened for eligibility based on the title, keywords, and abstracts. The studies were then
examined to determine whether they met the following inclusion criteria:
(a) Location. To be eligible, studies had to take place in the United States.
(b) Publication date. We limited the search to studies published between 2002 and
2017. The 2002 start date was chosen because it marks a transition in how teachers
approached reading interventions. Beginning circa 2002, initiatives in states such as
Texas and California (and numerous others) were reinforced by the Reading First pro-
grams (NCLB, 2002) emphasis on small-group preventative reading interventions based
on early screening in the primary grades. Research after this date focused more on the
effectiveness of these preventative interventions. Therefore, studies published after 2002
seemed most relevant to the research questions.
(c) Reading intervention. The study had to focus on the effectiveness of a reading
intervention: that is, preventative instructional practices and activities designed to help
students who are considered at risk for reading difficulties (e.g., Gersten, Compton,
et al., 2009). The interventions had to be at least 8 h in duration and could be provided
to small groups of students or individually to one student. We did not exclude any stud-
ies based on the size of the small groups. The interventions could be conducted at
school, either during school or after school, or at non-school clinics. The intervention
could be conducted during the school year or during summer break. They could be
delivered by teachers, researchers, tutors, volunteers, parents, or paraprofessionals, pro-
vided they followed a specific intervention program or a clearly outlined approach.
Although we included interventions that taught phonological awareness, we did not
include interventions that focused solely on phonological awareness without providing
any instruction on reading words and/or pseudowords. We also did not include
whole-class (Tier 1) interventions (even if it was noted that the entire class or school
was considered at risk for reading failure) or intensive Tier 3 interventions that were
meant to meet the individual needs of students who failed to benefit from evidence-
based interventions (e.g., Fuchs, Fuchs, et al., 2008; Gersten, Compton, et al., 2009). In
other words, studies that selected only students who were nonresponders to a Tier 1 or
2 intervention were excluded. Denton et al. (2013), for example, examines the impact of
an intervention for students who have failed to show progress in both Tier 1 and Tier 2
interventions and, therefore, was excluded from the meta-analysis. Finally, we excluded
interventions that were delivered only at home, conducted in a language other than
English, or included only a professional development component for teachers on the
topic and lacked a specific intervention or intervention approach.
(d) Study design. Only RCTs and QEDs were included.
(e) Sample. The participants had to be students in Grades 13 who were considered
at risk for reading difficulties. To be considered at risk, students had to have (a) a score
on a valid screener or screening battery indicating that the student was likely to be at
risk for possible reading failure at the end of the school year or (b) a score on a norm-
referenced standardized test (such as Woodcock Reading Mastery) indicating that the
student performed below the 40th percentile at the beginning of the school year or at
the end of the previous school year. If a study sample included students from grades
that were outside the scope of the review (e.g., Grades K, 4, or 5), then the study had to
meet one of the following criteria: (a) the study findings disaggregated the results of stu-
dents in eligible grades or (b) students in eligible grades represented over 50 percent of
the aggregated mixed-age sample.
(f) Outcomes. Studies had to include outcome measures of reading proficiencies and
skills (i.e., word reading, passage fluency, reading comprehension, or overall reading
achievement). Studies that only included measures of pre-reading skills such as phon-
emic awareness, rhyming, and oral comprehension were excluded.
Of the 2,423 publications that were examined, 54 met the initial criteria for inclusion.
See Figure 1 for a pictorial representation of the screening process.
Coding of Studies
The 54 publications that met initial inclusion criteria were coded in 3 phases. In Phase
1, publications were coded for quality of research design. In Phase 2, publications were
coded to identify study characteristics and intervention characteristics. Finally, in Phase
3, publications were coded to explore the areas of reading covered in the interven-
tion lessons.
Phase 1 Coding: Quality of Research Design
In the first phase, two members of the research team (who are certified WWC
reviewers) independently examined each publication for the study design's strength and quality using the WWC Procedures and Standards Handbook (Version 3.0; U.S. ED et al., 2013).
Only studies that met WWC standards (with or without reservations) were included in
Phases 2 and 3.
Several publications we reviewed included more than one study (e.g., Denton,
Fletcher, Taylor, Barth, & Vaughn, 2014; Lane, Pullen, Hudson, & Konold, 2009).
For the purposes of this project, we defined a study as any comparison with a
unique treatment group compared to a unique, business-as-usual control condition.
Studies comparing the impacts of two researcher-controlled interventions, as well as
comparisons of variations in treatments, were excluded. For example, in Lane et al.
(2009), researchers report the effects of an intervention, as well as three variations of
that intervention, when compared with that of a business-as-usual control condition.
We considered each intervention and variation as a unique treatment group, and
each comparison of a unique treatment group with a business-as-usual control as a
separate study; therefore, we counted four studies in this publication (i.e., T0 vs. C,
T1 vs. C, T2 vs. C, and T3 vs. C). The comparisons of each variation in treatment
with the others were excluded. Studies of variations in treatments focus on a much
more precise research question than the effectiveness of reading interventions. The framework of this meta-analysis could not account for these more experimental manipulations of specific components.

[Figure 1. Literature search, screening, and reviewing. Of the 2,423 publications screened for eligibility, 2,369 were excluded at screening (not conducted in the U.S., not published between 2002 and 2017, no eligible reading intervention, or not an eligible study design, sample, or outcome). The remaining 54 publications were coded for quality: 34 met WWC evidence standards and 20 did not, because of design quality (a randomized controlled trial with high attrition or a quasi-experimental design with analysis groups not shown to be equivalent) or a confounding factor (only one unit assigned to at least one condition, or the intervention always used in combination with another intervention). Nine publications were then excluded because they compared two unique treatments, leaving 25 publications comprising 33 studies in the meta-analysis.]
In total, of the 54 publications reviewed, 25 publications included 33 separate studies
that met standards (with or without reservations). See Figure 1.
Phase 2 Coding: Study and Intervention Characteristics
For studies that met WWC group design standards (with or without reservations), we
coded the following study characteristics: nature of the comparison, design (either RCT
or QED), grade level, participants' risk level, and outcome domain and type. Coding of
the intervention characteristics addressed the following: What was the size of the inter-
vention group, who implemented the intervention (i.e., interventionist), was the inter-
vention scripted, was monitoring and feedback provided to the interventionist, and how
many hours of instruction were provided per week. See Table 1 for the operational defi-
nitions. Two members of the research team coded all study and intervention charac-
teristics. The researchers discussed and rectified any discrepancies. After the initial
coding, a third researcher coded a randomly selected 20 percent of the studies for reli-
ability purposes. Reliability was 90.6 percent.
Phase 3 Coding: Area/Focus of Instruction
We examined descriptions of the interventions provided in the publications and cataloged
the interventions in two ways: (a) the target area of instruction, and (b) the focus of instruc-
tion. Each study was examined to determine whether any of the following reading areas were addressed during the intervention: phonological awareness, decoding, encoding (spelling), fluency, vocabulary, comprehension, and writing. If an intervention covered a reading area in any manner, minimally or extensively, then the study was coded for that area.
During our coding, we noticed that some studies covered the main areas of reading
minimally during the intervention while others gave evidence of extended explicit
instruction. For instance, many studies mentioned that they included reading compre-
hension in the lesson, but then only described that they asked comprehension questions
as or after students read a passage, without providing any explicit instruction in com-
prehension strategies. Consequently, instruction in each intervention was further exam-
ined to determine whether the reading areas were taught routinely and explicitly. If so,
the studies were also coded for having a focus in that area of reading. This additional level of coding, the focus of instruction, was limited to decoding, fluency, reading comprehension, and vocabulary.
An example to illustrate how the research team determined whether a component
was a focus of instruction is the coding of the reading comprehension component in
Denton et al. (2014). This study consisted of two treatment conditions, explicit instruc-
tion and guided reading, and a control group. Comprehension was coded as a focus of
instruction in the explicit instruction condition but not the guided reading condition. In
the former, instruction consisted of teachers modeling comprehension strategies using
"think-alouds" and providing specific feedback when students practiced in small groups.
Although the guided reading condition included discussion activities, teachers never
modeled or provided any clear guidance on how and when to use various strategies to
discern a cause-effect relationship or for succinct retelling.
Coding of studies during this phase was done collaboratively by two members of the
research team (who are experts in beginning reading). Reliability was calculated on 20
percent of randomly selected studies. Reliability was 85.71 percent.
Data Analysis
Calculation of Effect Sizes
To determine each intervention's impact, we calculated the average effect size for each domain of reading (i.e., word and pseudoword reading, passage reading fluency, vocabulary, and reading comprehension). The effect size was calculated for each outcome using the means and pooled standard deviations for the intervention and comparison groups, and corrected for small-sample bias using Hedges' (1981) procedures. In cases where means and standard deviations were not available, the t or F statistics and the treatment and comparison group sample sizes were used to calculate Hedges' g.

Table 1. Study and intervention characteristic definitions.
Design: RCT = randomized controlled trial; QED = quasi-experimental design.
Grade level (a): 1 = first grade only; 2/3 = second- and third-grade combination class.
Nature of the comparison group: Core reading instruction only = business-as-usual, whole-class reading instruction with no additional support (i.e., Tier 1); School-provided intervention = reading interventions typically provided by the school/district (i.e., some form of preventative intervention provided in addition to core Tier 1 reading).
Participants' risk level (b): At risk = only students in the 25th percentile or lower on a standardized norm-referenced screener; Minimal risk = students considered potentially at risk who score below the 40th percentile on a standardized norm-referenced screener.
Outcome measure domains: Word or pseudoword reading (e.g., TOWRE, Woodcock-Johnson Word Attack); Passage reading fluency (e.g., AIMSweb Standard Reading Assessment Passages); Reading comprehension (e.g., Woodcock Reading Mastery Tests [WRMT] Passage Comprehension subtest, GRADE reading comprehension subtest).
Measure type (c): Standardized tests = existing measures administered, scored, and interpreted in the same way for all test-takers; Researcher-developed measures = only those the researchers developed for the study.
Group size: Small group = groups of more than one student; One-on-one = 1 student with 1 interventionist.
Interventionist (d): Certified teacher = had a teaching credential, even if not employed as a full-time teacher at the schools where the studies took place; Paraprofessional = anyone who worked or volunteered at the school as part of the study and had no teaching credential; Research staff = typically graduate students at a university.
Avg. hrs./week of instruction: Low = less than 1.5 hours per week; Medium = 1.5 to 2.0 hours per week; High = 2.0 or more hours per week.
Scripted: Yes = interventionist provided with step-by-step instructions on what to say and do during each session; No = interventionist not provided with step-by-step instructions.
Monitoring and feedback: Yes = interventionists were observed conducting the intervention and were provided feedback after they were observed; No = interventionists were not observed and no feedback was provided.
Notes. (a) If a study included students from more than one grade (e.g., from Grades 1 and 2), the study was assigned to the grade level of the majority of the sample. (b) If the authors did not provide a percentile on a nationally normed test to describe the at-risk sample, the study was not coded for this variable. (c) Measures of pre-reading skills such as phonological awareness, rhyming, and letter naming were excluded, as were measures of listening comprehension, spelling, and writing. (d) Studies with a mix of interventionists (e.g., both teachers and paraprofessionals) were coded by the most prevalent category.
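To make the effect-size computation described above concrete, the sketch below applies the standard Hedges' g formulas (the pooled-standard-deviation effect size with the small-sample correction, and the conversion from a reported t statistic); the numeric values in the example are hypothetical and are not drawn from any study in the review.

```python
import numpy as np

def hedges_g(m_t, m_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with Hedges' (1981) small-sample
    correction: g = J * (M_t - M_c) / SD_pooled, J = 1 - 3 / (4*df - 1)."""
    sd_pooled = np.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                        / (n_t + n_c - 2))
    d = (m_t - m_c) / sd_pooled
    j = 1 - 3 / (4 * (n_t + n_c - 2) - 1)   # bias-correction factor
    return j * d

def hedges_g_from_t(t, n_t, n_c):
    """Recover g when only a t statistic and the group sizes are reported
    (for a two-group F test with 1 numerator df, t = sqrt(F))."""
    d = t * np.sqrt(1.0 / n_t + 1.0 / n_c)
    j = 1 - 3 / (4 * (n_t + n_c - 2) - 1)
    return j * d

# Hypothetical example: intervention group M = 102, SD = 14, n = 45;
# comparison group M = 96, SD = 15, n = 44 -> g is roughly 0.41.
print(hedges_g(102, 96, 14, 15, 45, 44))
print(hedges_g_from_t(2.1, 45, 44))
```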
Meta-Analytic Procedures
To account for dependencies in our data, we used random-effects robust variance estimation (RVE) techniques (Hedges et al., 2010). RVE permits the comparison of effect sizes across studies in which multiple, dependent effect sizes are drawn from the same sample. Random-effects analyses were conducted using the statistical software Stata (StataCorp, 2015) and the Robumeta package (Hedberg, 2011), a macro that applies the RVE techniques. In RVE, the mean correlation between all pairs of effect sizes within a study (ρ) must be specified to estimate the study weights and calculate the between-study variance. We used a ρ value of .80 to estimate the between-study variance and then conducted sensitivity analyses using ρ values of 0 to .90; Hedges et al. (2010) demonstrated that the value selected for ρ generally does not affect results much and recommended analyzing models with varying ρ values, and we found no meaningful differences in the results across models, indicating that our findings were robust across estimates of ρ. The small-sample correction developed by Tipton (2015) was implemented in Robumeta for all models, as RVE results have been shown to inflate the Type I error rate when the meta-analysis includes fewer than 40 studies (Tipton, 2015).
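As a rough illustration of these procedures, the sketch below estimates the intercept-only RVE model with the approximate correlated-effects weights of Hedges, Tipton, and Johnson (2010), w_ij = 1/(k_j(v̄_j + τ²)); it takes τ² as given, omits Tipton's (2015) small-sample correction, and uses made-up data, so it is a simplified approximation of what Robumeta computes rather than a substitute for it.

```python
import numpy as np

def rve_intercept_only(es, var, study, tau2):
    """Weighted mean effect size and robust (sandwich) standard error for
    the intercept-only RVE model; weights follow the correlated-effects
    approximation w_ij = 1 / (k_j * (mean sampling variance + tau2))."""
    es = np.asarray(es, dtype=float)
    var = np.asarray(var, dtype=float)
    study = np.asarray(study)
    w = np.empty_like(es)
    for s in np.unique(study):
        m = study == s
        w[m] = 1.0 / (m.sum() * (var[m].mean() + tau2))
    beta = np.sum(w * es) / np.sum(w)            # overall mean effect
    resid = es - beta
    # Robust variance: squared weighted residual totals, summed by study.
    num = sum(np.sum(w[study == s] * resid[study == s]) ** 2
              for s in np.unique(study))
    se = float(np.sqrt(num / np.sum(w) ** 2))
    return float(beta), se

# Hypothetical data: studies A and B each contribute two effect sizes.
g   = [0.45, 0.30, 0.55, 0.20, 0.40]
v   = [0.04, 0.04, 0.06, 0.05, 0.03]
sid = ["A", "A", "B", "B", "C"]
print(rve_intercept_only(g, v, sid, tau2=0.02))
```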
We estimated a series of meta-regression models using RVE. First, we ran an inter-
cept-only model in which the estimate for the constant represented the average weighted
effect size across all 33 studies (Tanner-Smith & Tipton, 2014). The Robumeta package
calculated the following indices of heterogeneity: the Q statistic and its p-value, and estimates of I² (the percentage of between-study heterogeneity not due to chance variation in effects) and τ² (the true variance in the population of effects).
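For readers unfamiliar with these indices, the sketch below computes the conventional Q, I², and a DerSimonian-Laird τ² for a set of effect sizes treated as independent; it is only a rough reference point, since the method-of-moments estimator Robumeta uses under RVE accounts for within-study dependence and will differ.

```python
import numpy as np

def heterogeneity(es, var):
    """Conventional heterogeneity indices for independent effect sizes:
    Q, I^2 (% of variability beyond chance), and a DerSimonian-Laird tau^2."""
    es = np.asarray(es, dtype=float)
    var = np.asarray(var, dtype=float)
    w = 1.0 / var                                # fixed-effect weights
    mean_fe = np.sum(w * es) / np.sum(w)
    q = np.sum(w * (es - mean_fe) ** 2)
    df = len(es) - 1
    i2 = max(0.0, (q - df) / q) * 100.0
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)
    return q, i2, tau2

print(heterogeneity([0.45, 0.30, 0.55, 0.20, 0.40],
                    [0.04, 0.04, 0.06, 0.05, 0.03]))
```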
Next, we examined the role of moderators such as group size, grade level, and level
of support provided to interventionists. Though one meta-regression model with covari-
ates for all moderators is preferable, the number of studies that met the inclusion crite-
ria was small, and not every study included information that permitted coding of all
moderators. Thus, this approach was not taken because results would be uninterpretable
due to insufficient degrees of freedom. Instead, we examined potential moderators using
separate RVE meta-regression models with only the moderator of interest entered as a
predictor. We interpret these results with caution due to potential confounding effects
of other moderators that are unaccounted for in these single-predictor models.
Moreover, a small number of single-predictor models remained underpowered (df <4),
likely a result of large imbalances in the data (Tipton, 2015).
The moderator variables were dummy coded and included as covariates in each model. To estimate a mean effect size for each level of the moderator variables (i.e., RCT and QED are levels of the design moderator variable), intercept-only models also were run for each level of the moderator. The p-value for determining statistical significance in each of the moderator analyses was set to p < .05.
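The following sketch illustrates that coding scheme with made-up records, reusing the rve_intercept_only function from the earlier sketch: the design moderator becomes a 0/1 dummy covariate for the single-predictor meta-regression, and separate intercept-only models are fit within each level to obtain the level-specific mean effects reported in the Results.

```python
import numpy as np  # assumes rve_intercept_only() from the sketch above

# Hypothetical records: one row per effect size.
es     = np.array([0.45, 0.30, 0.55, 0.20, 0.40, 0.15])
var    = np.array([0.04, 0.04, 0.06, 0.05, 0.03, 0.07])
study  = np.array(["A", "A", "B", "B", "C", "D"])
design = np.array(["RCT", "RCT", "RCT", "RCT", "QED", "QED"])

# Dummy coding: the covariate entered in the single-predictor model.
is_qed = (design == "QED").astype(int)

# Intercept-only model within each level of the moderator.
for level in ("RCT", "QED"):
    m = design == level
    beta, se = rve_intercept_only(es[m], var[m], study[m], tau2=0.02)
    print(level, round(beta, 2), round(se, 2))
```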
Publication Bias
The potential impact of publication bias using the trim-and-fill methodology (Duval &
Tweedie, 2000) was examined by constructing a funnel plot of the effect sizes and not-
ing any asymmetry in the distribution of effects. The plot was then systematically
trimmed, removing the effect sizes causing the asymmetry, and filled in with any effect
sizes that may have been missing from unpublished studies that resulted in small and
non-significant treatment effects. The analysis estimated the number of missing effect
sizes and recalculated the overall mean effect size in a way that reflects the presence of
these missing effects.
Results
Study Characteristics
A total of 33 studies from 25 publications met WWC group design standards and were
included in the final meta-analysis. These 33 studies spanned 13 years (2004–2016) and
provided a total of 128 effect sizes. The sample sizes in the studies ranged from 21 to
6,888 students. Total sample size across all studies was 11,737 students (median sample
size = 89).
Thirty of the studies were RCTs; the remaining three were QEDs. The comparison
condition for the majority of studies (k = 21) was Tier 1 core classroom instruction (i.e.,
nothing other than what the classroom teacher chose to provide). In the remaining 12
studies, the comparison was typical school- or district-provided intervention. Twenty-
two studies were conducted in Grade 1, and 11 studies were in Grades 2 and/or 3. Only
16 studies provided the information necessary for coding participants' risk level (across the studies, a wide range of screening measures and operational definitions were used, and the screeners were typically not nationally normed; if the authors did not provide a percentile on a nationally normed test, the study was not coded for this variable). Of
those, 12 studies included students in the minimal risk category, and only 4 studies
included students in the at-risk category (i.e., 25th percentile or lower). The most com-
mon outcome domain was word or pseudoword reading (included in all but one of the
33 studies). Nineteen studies included outcomes in reading comprehension, 16 included
passage reading fluency outcomes, and only two studies included outcomes in vocabu-
lary. The study characteristics for each study included in the meta-analysis are presented
in Table 2.
Intervention Characteristics
Interventions were delivered to students in one-on-one settings in 21 of the studies and in small-group settings (group sizes in the included studies ranged from 2 to 5 students) in 12 studies.
Table 2. Study and intervention characteristics for each study included in the meta-analysis.
Columns: Author; Design (b); Comparison (c); Grade; At-risk level (d); Outcome domain (e); Grouping (f); Interventionist (g); Scripted (h); Avg. hrs./wk. (i); Monitoring (j).
Allor and McCathren (2004) Study 1 RCT CR 1 RC 1:1 P Y L Y
Allor and McCathren (2004) Study 2 RCT CR 1 PF, RC, WR 1:1 P Y L Y
Berninger, Abbott, Vermeulen, and Fulton (2006) RCT CR 2/3 A WR SG R N M N
Blachman et al. (2004) RCT SI 2/3 A PF, RC, WR 1:1 CT N H Y
Case et al. (2010) RCT CR 1 WR SG R Y M Y
Case et al. (2014) RCT CR 1 PF, WR SG R Y M Y
Denton et al. (2010) RCT SI 1 RC, WR SG CT N H Y
Denton et al. (2014) RCT SI 2/3 M PF, RC, WR SG P Y H Y
Denton et al. (2014) RCT SI 2/3 M PF, RC, WR SG P Y H Y
Fien et al. (2015) RCT SI 1 M PF, WR SG P N H Y
Fuchs, Compton, Fuchs, Bryant, and Davis (2008) RCT CR 1 WR SG R N H N
Gunn et al. (2005) RCT CR 2/3 PF, RC, WR SG P Y M Y
Jacob, Armstrong, Bowden, and Pan (2016) RCT SI 2/3 PF, RC, WR 1:1 P Y L Y
Jenkins, Peyton, Sanders, and Vadasy (2004) QED CR 1 A RC, WR 1:1 P Y M Y
Lane et al. (2009) RCT CR 1 M WR 1:1 R N M N
Lane et al. (2009) RCT CR 1 M WR 1:1 R N M N
Lane et al. (2009) RCT CR 1 M WR 1:1 R N L N
Lane et al. (2009) RCT CR 1 M WR 1:1 R N L N
May, Sirinides, Gray, and Goldsworthy (2016) RCT SI 1 RC, WR 1:1 CT N H N
OConnor et al. (2010) RCT CR 2/3 PF, RC, WR 1:1 P N L N
OConnor et al. (2010) RCT CR 2/3 PF, RC, WR 1:1 P N L N
Pullen, Lane, and Monaghan (2004) RCT CR 1 M WR 1:1 P N L N
Scanlon, Vellutino, Small, Fanuele, and Sweeney (2005) RCT SI 1 RC, WR 1:1 CT H Y
Scanlon et al. (2005) RCT SI 1 RC, WR 1:1 CT H Y
Schwartz (2005) (a) RCT CR 1 RC, WR 1:1 CT N H N
Smith et al. (2016) RCT SI 1 M PF, WR SG P N H Y
Vadasy and Sanders (2011) (a) RCT CR 1 PF, RC, WR 1:1 P Y L Y
Vadasy, Sanders, and Peyton (2006) QED (a) QED CR 2/3 M PF, RC, WR 1:1 P Y M Y
Vadasy et al. (2006) RCT (a) RCT CR 2/3 M PF, RC, WR 1:1 P Y M Y
Vadasy et al. (2007) (a) RCT CR 2/3 M PF, WR 1:1 P Y M Y
Vellutino and Scanlon (2002) RCT SI 1 A WR 1:1 CT N H Y
Wang and Algozzine (2008) RCT CR 1 RC, WR SG P Y L N
Wanzek and Vaughn (2008) (a) QED SI 1 PF, WR SG R Y H Y
Notes. (a) Studies also included in Wanzek et al. (2016). (b) RCT = randomized controlled trial; QED = quasi-experimental design. (c) CR = core reading instruction; SI = school-provided intervention. (d) A = at risk, a sample with students only in the 25th percentile or lower; M = minimal risk, a sample that included students below the 40th percentile; blank = authors did not provide a percentile, or the information to calculate one, from a nationally normed test given for screening purposes. (e) WR = word and pseudoword reading; RC = reading comprehension; PF = passage fluency. (f) SG = small group; 1:1 = one-on-one. (g) P = paraprofessional; R = researcher; CT = certified teacher. (h) Y = scripted; N = not scripted. (i) L = low; M = medium; H = high. (j) Y = monitoring and feedback included; N = monitoring and feedback either not included or not reported.
The interventions were implemented by a researcher in nine studies, a
certified teacher in seven studies, and a paraprofessional in 15 studies. In 21 studies,
additional support in the form of monitoring and feedback was provided to the inter-
ventionists. Nearly half of the studies (k = 15) included scripted interventions. The average hours per week of intervention ranged from less than an hour (i.e., 45 minutes) to 4.17 hours per week (median = 2). The intervention characteristics for each study included
in the meta-analysis are described in Table 2.
Instructional Area/Focus of the Intervention
All the studies included in the meta-analysis examined interventions that focused on
building students' reading skills in more than one instructional area (e.g., decoding, flu-
ency, comprehension). In many respects, the interventions appeared to be similar to
each other, mainly addressing decoding and fluency, while also attending to one or
more areas of reading instruction, such as encoding, comprehension, vocabulary, phono-
logical awareness, and writing.
All but two studies (k = 31) addressed decoding. Many studies also included instruction in passage fluency (k = 29) and encoding (k = 23). Over 50 percent of the studies addressed phonological awareness (k = 19) and comprehension (k = 18). Vocabulary (k = 13) and writing (k = 7) were addressed less frequently.
All studies with decoding and fluency were coded for both area of instruction as well
as the focus of instruction. Only nine of the 18 studies that were coded for comprehen-
sion were also given a Yes under the focus code as they showed evidence of systematic
teacher-led explicit instruction in comprehension that went beyond asking literal ques-
tions, questions about title and pictures, or monitoring comprehension strategies. Only
one of the nine studies coded for vocabulary was also given a Yes for focus. Vocabulary
instruction rarely involved explicit teaching and interaction; it typically involved defin-
ing words if students asked, or asking students to look at pictures to derive the meaning
of a word.
Meta-Analytic Results
The meta-analysis included 128 effect sizes from 33 studies. See Table 3 for effect sizes
(Hedges' g) for all outcome measures by domain for each study. Effect sizes ranged widely, from −0.20 to 1.37. The mean effect size for these studies was 0.39 (SE = .04, p < .001, 95% CI [0.32, 0.46]), indicating that the reading interventions were generally effective across students, settings, and measures. As expected, treatment effects varied considerably. The I² estimate of the percentage of between-study heterogeneity not due to chance was 50.75%, with a τ² estimate of the true variance in the population of effects of 0.02.
Table 3. Outcomes and effect sizes.
Columns: Author; Total N; Outcome domain (a); Measure type (b); ES (c).
Allor and McCathren (2004) Study 1 86 RC S 0.50
Allor and McCathren (2004) Study 2 157 WR S 0.05, 0.13, 0.33, 0.44, 0.78
RC S −0.16
PF R 0.13
Berninger et al. (2006) 93 WR S 0.35
Blachman et al. (2004) 69 WR S 0.74, 0.87
RC S 0.53
PF S 0.70
Case et al. (2010) 30 WR S 0.48, 0.73, 0.73
Case et al. (2014) 123 WR S 0.02, 0.17, 0.21, 0.23
PF S 0.20
Denton et al. (2010) 422 WR S 0.42
RC S 0.51
Denton et al. (2014) 103 WR S 0.34, 0.40, 0.50
RC S 0.08, 0.13
PF S 0.16
Denton et al. (2014) 112 WR S 0.31, 0.50, 0.63
RC S 0.29, 0.46
PF S 0.45
Fien et al. (2015) 239 WR S 0.38, 0.45
PF S 0.30
Fuchs et al. (2008) 64 WR S 0.26, 0.26, 0.38, 0.46, 0.65
Gunn et al. (2005) 245 WR S 0.30, 0.52
RC S 0.32
PF S 0.24
Jacob et al. (2016) 1,166 WR S 0.11
RC S 0.10
PF S 0.09
Jenkins et al. (2004) 99 WR S 0.37, 0.50, 0.52, 0.73, 0.76, 1.12
RC S 0.74
Lane et al. (2009) 41 WR R 0.64, 0.71
WR S 1.24
Lane et al. (2009) 42 WR R 0.24, 0.29
Lane et al. (2009) 43 WR R 0.39, 0.55
Lane et al. (2009) 46 WR R 0.52, 0.59
WR S 1.02
May et al. (2016) 6,888 WR S 0.41
RC S 0.42
OConnor et al. (2010) 40 WR S 0.10, 0.56
RC S 0.48, 0.53
PF S 0.60, 0.75, 0.76, 0.87
OConnor et al. (2010) 43 WR S 0.25, 0.57
RC S 0.37, 0.44
PF S 0.81, 0.84, 0.93, 1.33
Pullen et al. (2004) 47 WR R 0.24, 0.81
WR S 0.54, 0.59
Scanlon et al. (2005) 114 WR S 0.31, 0.55
RC S 0.41
Scanlon et al. (2005) 117 WR S 0.51, 0.62
RC S 0.35
Schwartz (2005) 74 WR S 0.93, 1.37
RC S 0.14
Smith et al. (2016) 743 WR S 0.19
PF S 0.12
Smith et al. (2016) 729 WR S 0.23, 0.32
Smith et al. (2016) 749 WR S 0.24
PF S 0.18
Vadasy and Sanders (2011) 89 WR S 0.51
RC S 0.29
PF R 0.69
Vadasy et al. (2006) QED 31 WR S 0.61, 0.72
RC S 0.50
PF R 0.81
Vadasy et al. (2006) RCT 21 WR S 0.67, 0.75
RC S 0.21
PF R 0.55
Vadasy et al. (2007) 43 WR S 0.47
PF S 0.52
Vellutino and Scanlon (2002) 118 WR S 0.38
Wang and Algozzine (2008) 139 WR S −0.03, 0.39, 0.45
RC S 0.17
Wanzek and Vaughn (2008) 50 WR S 0.12, 0.18
PF S −0.20
Notes. (a) WR = word and pseudoword reading; RC = reading comprehension; PF = passage fluency. (b) R = researcher developed; S = standardized. (c) One effect size per measure; if four effect sizes are listed, four measures were used in that outcome domain.

Moderators
Eleven categorical moderator analyses of study characteristics (six variables) and intervention characteristics (five variables) were individually tested using each as a single
predictor in the meta-regression models. Some of these variables emerged as significant
moderators of the relationship between reading interventions and effect sizes on meas-
ures of students' reading proficiency.
Findings are summarized in Table 4. The coefficients from the intercept-only models
should be interpreted as the weighted effect size for studies with that level of moderator,
and statistically significant results indicate that the mean effect size for studies with that
level of the moderator is significantly different from zero.
Study Characteristics. The outcome domain for the measures, when comparing the three areas of reading (word or pseudoword reading, reading comprehension, passage reading fluency), significantly moderated effect size; effects from outcomes in the vocabulary domain could not be analyzed due to the small number of studies (k = 2). On average, outcomes in the word or pseudoword reading domain yielded the largest effect size (b = 0.41 [0.33, 0.50], p < .001, k = 32). Reading comprehension domain outcomes produced a slightly smaller effect (b = 0.32 [0.20, 0.43], p < .001, k = 19), followed by passage reading fluency outcomes, which generated the smallest effect size (b = 0.31 [0.17, 0.44], p < .001, k = 16). The only significant difference, however, was for outcomes in the word or pseudoword reading domain, which yielded significantly larger effect sizes than outcomes in the domains of reading comprehension or passage reading fluency (b = 0.10 [0.00, 0.19], p = .049, k = 32).
Study characteristic variables that did not significantly moderate the effect size included grade level (b = −0.06 [−0.26, 0.15], p = .542, k = 33), research design (b = −0.06 [−1.03, 0.92], p = .823, k = 33), the nature of the comparison group (b = −0.11 [−0.25, 0.04], p = .139, k = 33), participants' risk level (b = 0.13 [−0.18, 0.44], p = .330, k = 16), and standardized measures versus researcher-developed measures (b = 0.11 [−0.07, 0.28], p = .189, k = 33).
Intervention Characteristics. None of the intervention characteristics led to significant moderator effects. These included interventions implemented by researchers (b = −0.01 [−0.21, 0.20], p = .939, k = 33), interventions implemented by certified teachers (b = 0.14 [0.00, 0.28], p = .053, k = 33), interventions implemented by a paraprofessional (b = −0.12 [−0.25, 0.02], p = .086, k = 33), whether an intervention was scripted (b = −0.13 [−0.28, 0.02], p = .094, k = 31), whether or not monitoring and feedback was provided (b = −0.12 [−0.26, 0.03], p = .103, k = 33), average hours per week of intervention for the low (b = −0.08 [−0.30, 0.15], p = .464, k = 33), medium (b = 0.04 [−0.14, 0.23], p = .610, k = 33), or high (b = 0.03 [−0.13, 0.18], p = .722, k = 33) categories, and grouping (either small-group or one-on-one intervention) (b = −0.13 [−0.27, 0.01], p = .075, k = 33). When tested per grade level, however, grouping was a significant moderator for Grade 1 but not Grades 2 and 3. For Grade 1 specifically, effects were larger if the intervention was delivered to students individually rather than to groups of students (b = −0.16 [−0.32, −0.01], p = .042, k = 22).
Area/Focus of Instruction. Given that all the studies examined interventions with multiple areas of instruction, we tested a meta-regression model using all seven areas of instruction. Analyzing the areas of instruction simultaneously allowed us to examine each area's moderating influence while holding the other areas constant. Of the seven areas of instruction, phonological awareness, encoding (spelling), and writing appeared to be significant moderators of effect sizes when holding all other areas of instruction constant (see Table 4). Interventions that included phonological awareness tended to result in smaller effects across the word or pseudoword reading, reading comprehension, and passage reading fluency outcomes (b = −0.19 [−0.32, −0.05], p = .010). However, the present meta-analysis did not specifically address phonological awareness outcomes, so we do not know the impact on those. In contrast, providing instruction in encoding (b = 0.18 [0.01, 0.35], p = .045) or writing (b = 0.18 [0.02, 0.34], p = .028) yielded significantly higher effect sizes when these were included as components of the intervention.
We were also interested in whether a focus in a particular area (that is, more in-depth and explicit instruction) significantly moderated impacts. Effect sizes were not significantly associated with providing a more in-depth instructional focus in decoding (b = −0.17 [−0.89, 0.55], p = .216, k = 33), fluency (b = −0.04 [−0.34, 0.26], p = .694, k = 33), or comprehension (b = −0.06 [−0.30, 0.17], p = .572, k = 33). Note that vocabulary was not examined here as a focus due to the limited number of studies.
Publication Bias
Finally, to determine whether the findings suffered from publication/small-study
bias, we implemented the trim-and-fill method (Duval & Tweedie, 2000). The
results indicated that 23 effect sizes were estimated to be missing from the current meta-analysis of 128 total effects. Including these in the random-effects model would minimally decrease the mean effect size from g = 0.39 to g = 0.32 (p < .001, 95% CI [0.27, 0.36]).
Table 4. Moderator analysis.
Columns: Moderator; Coeff; SE; 95% CI; p; df; Q; I²; τ²; n; k; ρ.
Grade level: 1 vs. 2/3 −0.06 0.09 (−0.26, 0.15) 0.542 12 54.82 41.63 0.02 128 33 .8
Grouping: Small group vs. 1:1 −0.13 0.07 (−0.27, 0.01) 0.075 20 64.31 50.24 0.02 128 33 .8
Research design: RCT vs. QED −0.06 0.23 (−1.03, 0.92) 0.823 <4 66.80 52.10 0.02 128 33 .8
Interventionist: Paraprofessional vs. Other −0.12 0.06 (−0.25, 0.02) 0.086 19 49.68 35.59 0.01 128 33 .8
Interventionist: Certified teacher vs. Other 0.14 0.06 (0.00, 0.28) 0.053 8 50.87 37.09 0.02 128 33 .8
Interventionist: Researcher vs. Other −0.01 0.09 (−0.21, 0.20) 0.939 10 67.00 52.24 0.02 128 33 .8
Scripted intervention: Scripted vs. Non-scripted −0.13 0.07 (−0.28, 0.02) 0.094 18 50.53 40.63 0.02 122 31 .8
Nature of the comparison group: SI vs. CR (a) −0.11 0.07 (−0.25, 0.04) 0.139 24 65.88 51.43 0.02 128 33 .8
At-risk sample: At risk vs. Minimal risk 0.13 0.12 (−0.18, 0.44) 0.330 5 16.24 n/a 0.01 58 16 .8
Hours of treatment: Low vs. Other −0.08 0.10 (−0.30, 0.15) 0.464 10 53.19 39.84 0.02 128 33 .8
Hours of treatment: Medium vs. Other 0.04 0.08 (−0.14, 0.23) 0.610 10 66.89 52.16 0.02 128 33 .8
Hours of treatment: High vs. Other 0.03 0.07 (−0.13, 0.18) 0.722 21 57.57 44.42 0.02 128 33 .8
Provided monitoring/feedback: Yes vs. No −0.12 0.07 (−0.27, 0.03) 0.103 12 54.27 41.04 0.02 128 33 .8
Measure type: Standardized vs. Researcher 0.11 0.07 (−0.07, 0.28) 0.189 7 66.36 51.78 0.02 128 33 .8
Outcome domain: Word/pseudoword reading vs. Other 0.10 0.05 (0.00, 0.19) 0.049 19 66.10 51.59 0.02 128 33 .8
Outcome domain: Passage reading fluency vs. Other −0.09 0.06 (−0.23, 0.05) 0.188 10 60.01 46.68 0.02 128 33 .8
Outcome domain: Reading comprehension vs. Other −0.05 0.05 (−0.23, 0.05) 0.308 13 66.58 51.94 0.02 128 33 .8
Instructional area (single model; n = 128, k = 33, ρ = .8): Constant 0.49 0.15 (−0.02, 1.00) 0.055 <4 44.07 27.39 0.02
Instructional area: Decoding −0.12 0.11 (−0.53, 0.29) 0.356 <4
Instructional area: Passage fluency 0.00 0.10 (−0.27, 0.26) 0.974 5
Instructional area: Reading comprehension −0.08 0.07 (−0.23, 0.06) 0.234 13
Instructional area: Vocabulary 0.08 0.09 (−0.12, 0.27) 0.414 11
Instructional area: Phonological awareness −0.19 0.06 (−0.32, −0.05) 0.010 13
Instructional area: Encoding 0.18 0.07 (0.01, 0.35) 0.045 8
Instructional area: Writing 0.18 0.07 (0.02, 0.34) 0.028 8
Note. Coeff = coefficient; SE = standard error; CI = confidence interval; p = significance; df = degrees of freedom; Q = test of homogeneity of effect sizes; I² = measure of effect size variability; τ² = between-study variance; n = number of effect sizes; k = number of studies; ρ = corrected correlation. In all RVE models, we used a ρ value of .80 to estimate the between-study variance; n/a = could not be estimated. Statistically significant estimates (p < .05) appear for word/pseudoword reading vs. other, phonological awareness, encoding, and writing. (a) CR = core reading instruction; SI = school-provided intervention.
Discussion
Results from this meta-analysis of 33 studies of reading interventions conducted
between 2002 and 2017 reveal significant, positive effects on a range of reading out-
comes. The significant mean effect size (Hedges' g) across 33 studies was 0.39 (p < .001),
indicating that students from Grades 1, 2, and 3 who score in the at-risk category on a
screening battery or on a normed test do, on average, benefit from the set of reading
interventions studied. This leads us to conclude that the research base underlying read-
ing interventions is sound and not the primary reason for the lack of impacts in the
national RtI evaluation (Balu et al., 2015), which found null, or in one case negative,
impacts on reading outcomes for students at or near the cut point on screening.
Mean effect sizes (Hedges' g) for each outcome domain ranged from 0.41 in the area of word or pseudoword reading to 0.32 in comprehension and 0.31 in passage reading fluency. All were statistically significant at p < .001. Note that the mean effect size was
the highest in the outcome domain of word and pseudoword reading. This is unsurpris-
ing, given the large body of evidence supporting the use of various forms of systematic,
explicit, small-group instruction in phonemic awareness, phonics instruction, and sight
word reading to help students who are likely to fall behind when experiencing more
traditional instruction (e.g., Gersten, Compton, et al., 2009; National Institute of Child
Health and Human Development [NICHD], 2000).
The reading interventions examined showed many commonalities. Every intervention
addressed multiple aspects of foundational reading: phonological awareness, decoding, passage reading fluency, encoding (spelling), and, on occasion, writing. Nearly all inter-
ventions addressed comprehension in some fashion, although few provided much in the
way of detail. Vocabulary and comprehension instruction were rarely emphasized.
Virtually all interventions included systematic, explicit instruction. Typically, this occurred during instruction in phonics/word-reading skills and passage reading fluency, often with
some activities geared toward fluency building and phonological awareness.
Interventions that included instruction on phonological awareness were associated
with significantly smaller effects, whereas interventions that addressed encoding or writ-
ing yielded significantly higher effect sizes. Perhaps focusing on pre-reading skills
such as phonological awareness after students have started to learn to decode is counterproductive, as it takes time and focus away from gaining proficiency in decoding
skills. We speculate that an encoding component may help reinforce phonics rules and
decoding, and we note that this has been a feature of some core reading programs and
intervention programs.
Variables for Future Exploration
The percentage of between-study heterogeneity not due to chance was 50.75%, suggest-
ing both a good deal of variance in the pattern of effects and the need to use moderator
analyses to begin to understand salient factors. Although many of the moderators
explored in the current meta-analysis were non-significant (p>.05), future research is
needed to explore aspects of the interventions that may moderate the relationship
between the intervention and reading outcomes. In particular, researchers should
continue to investigate variables that could provide us with possible explanations for the
null and negative impacts from the Balu et al. (2015) study.
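For reference, the percentage of heterogeneity not attributable to chance is conventionally reported as the I² statistic. Assuming the standard formulation (the article does not spell out the estimator), it can be written as

$$ I^{2} = \max\!\left(0,\ \frac{Q - df}{Q}\right) \times 100\%, $$

where Q is the homogeneity statistic and df is its degrees of freedom. Values near 50% are usually read as moderate heterogeneity, consistent with the authors' decision to pursue moderator analyses.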
One finding worth exploring further is whether the type of interventionist moderates impacts on measures of reading achievement. Interventions implemented by certified teachers did not yield significantly higher effect sizes (p = .053) than those conducted by others (primarily paraeducators or university students working for a researcher). Yet, results of the Balu et al. (2015) survey revealed that teachers provided intervention in over a third of schools implementing RtI in Grades 1-3. Similarly, the effect sizes for interventions delivered by paraprofessionals did not differ significantly from those delivered by certified teachers or researchers (p = .086) in our analyses. These results conflict
with those reached by Slavin et al. (2011) in an earlier review of literature on reading
interventions, which suggests this is an area that warrants further investigation.
The results from the Balu et al. (2015) study also suggest that schools implementing
RtI often used small groups ranging from 2 to 10 students, as opposed to the interven-
tions in the meta-analysis, which were implemented in smaller groups (2 to 5 students)
or one-on-one. We found an average effect size of 0.46 for interventions that were deliv-
ered one-on-one and 0.31 for those delivered to small groups of students; however, this
moderator variable was not statistically significant (p = .075). Further analyses revealed that grouping moderated effects for Grade 1 but not for Grades 2 and 3 (p = .042). One-on-one instruction may be more beneficial for beginning readers. Similarly, even in small groups of 2-5 students, it may be easier to meet students' needs if all the students in the group are similar in their basic knowledge of rhyming, alphabet, phonemes, and decoding skills. A recent study by Al Otaiba, Connor, et al. (2014) supports this notion. They found that it was necessary to make small groups more homogeneous by adjusting both the text's readability level and the lesson pacing to meet students' individual needs.
It could also be that the schools in the Balu et al. (2015) study used more scripted
interventions, though we cannot know for sure since the Balu survey did not ask
whether scripted interventions were used. Previous research has indicated that programs
where teachers are given some autonomy tend to produce stronger outcomes in reading
comprehension (Fang, Fu, & Lamme, 2004; Tivnan & Hemphill, 2005; Wilson, Martens,
& Arya, 2005). One reason for this might be that scripted interventions leave little room
for even slight adaptations to meet individual student needs when compared to those
with a lesson plan and no exact wording. Our results, however, showed that effect sizes for research interventions that allowed teachers to adapt the intervention to students' needs did not differ significantly from those for scripted interventions (p = .094). Future research
should explore this area.
To avoid overgeneralizing from these findings, it is important to note that other variables may be confounding the relationship. For example, all but 3 of the 15 scripted
interventions were implemented by paraprofessionals. Typically, paraprofessionals
implement scripted programs because most do not have the training to make appro-
priate instructional decisions when using a traditional lesson plan. Because our lim-
ited number of studies hindered our ability to model all the potential moderators at
once (i.e., controlling for other variables), the moderator findings should be inter-
preted with caution.
Relation to Previous Relevant Meta-Analyses
It is difficult to draw a direct comparison between the current study and the Slavin et al. (2011) and Wanzek et al. (2016, 2018) meta-analyses. Though the studies included in this meta-analysis overlap with some of the studies included in the other meta-analyses,
this meta-analysis is the first to use rigorous standards of evidence in the inclusion cri-
teria. The other meta-analyses included studies that were not as rigorous as those in the
current study and included kindergarten interventions, which typically focus heavily on
reading-related skills such as phonological awareness, rhyming, and basic decoding.
The impacts in the current meta-analysis (0.39) are smaller than several impacts in
the Wanzek et al. (2016) meta-analysis of studies of shorter interventions: 0.54 on stand-
ardized foundational skill measures, 0.62 for non-standardized foundational skill meas-
ures, and 1.02 for non-standardized multicomponent measures. The differences in the
magnitude of effect sizes may be due to studies that were not as rigorous as those in the
current study or to the inclusion of kindergarten interventions. However, in the Wanzek
et al. (2016) study, domain-level impacts were reported for composite domains: foundational reading/reading-related skills (including phonological awareness, rhyming, letter identification, as well as measures of decoding; 0.54 to 0.62) and multicomponent measures (a composite of listening and reading comprehension; 0.36 to 1.02), which
makes it difficult to compare against our domain-level impacts, which ranged from 0.31
to 0.41.
Yet the effects in the Wanzek et al. (2018) meta-analysis of studies of longer interventions, the impacts of one-on-one and small-group interventions in Slavin et al. (2011), and the impacts on standardized multicomponent measures in Wanzek et al. (2016) are all similar to those found in our analysis. These findings suggest consistency in
the impact of reading interventions for struggling readers.
Challenges and Limitations in Conducting the Meta-Analysis
Issues in Using Rigorous Design Standards and Contemporary Meta-Analytic Techniques
A unique feature of this meta-analysis is that it included only those studies that met
what is often called the gold standard: What Works Clearinghouse (WWC 3.0) standards for RCTs and quasi-experimental designs. Ninety-one percent of the studies included in this meta-analysis were RCTs (k = 30), which is a much higher proportion of RCTs
than in similar previous meta-analyses (e.g., Wanzek et al., 2016 [55% RCTs]; Swanson,
1999 [47.9% RCTs]). Including only those studies that met these rigorous standards
allows for more confidence in the meta-analytic findings. This is an especially important
contemporary issue given the lack of replicability of findings in the social sciences
(Ioannidis, 2005) and the general concern about false positives in both individual
research studies (Benjamini & Hochberg, 1995) and meta-analyses (Greco, Zangrillo,
Biondi-Zoccai, & Landoni, 2013).
The gain in trustworthiness, however, came at the cost of statistical power, because fewer studies met the rigorous design standards; this particularly affected the crucial moderator analyses, which help in understanding possible underlying themes in the data. This is likely to become an issue in future meta-analyses, as the
tradeoff between the quality and validity of the research findings conflicts with the need
for a large number of studies in conducting important moderator analyses with suffi-
cient power. We suspect it will take some time for the field to produce enough high-
quality studies to result in statistically significant findings from which we could draw
conclusions across studies.
As studies most often contain more than one outcome measure and at times more
than one comparison, meta-analyses must address the dependencies arising from such
multiple outcomes and comparisons within studies. This issue was pertinent for our
meta-analysis, as 90% of the studies included multiple measures and 15% contained
multiple comparisons. Thus, to account for the dependencies in the data, we used Robust Variance Estimation (RVE; Hedges et al., 2010), a contemporary statistical technique. One problem with RVE is that it yields low statistical power unless the meta-analysis includes a large number of studies (López-López, Van den Noortgate, Tanner-Smith, Wilson, & Lipsey, 2017). Tipton (2015) notes that at least 40
studies are needed for adequate statistical power to conduct the moderator analyses.
Our meta-analysis included 33 high-quality experimental and quasi-experimental studies
(meeting WWC standards), a number not typically seen for a topic as specific as this in
other areas of educational research. However, it still fell below the minimum number of
40 studies for adequate statistical power to conduct the moderator analyses.
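To illustrate the mechanics of the correlated-effects RVE estimator described by Hedges et al. (2010), the sketch below computes a robust mean effect size and its robust standard error. This is not the authors' analysis code (the reference list points to Stata's robumeta module); the between-study variance τ² is supplied directly rather than estimated (in practice it comes from a method-of-moments formula in which the assumed correlation ρ, set to .80 in this article, appears), the data are fabricated, and Tipton's (2015) small-sample adjustments are omitted.

```python
import numpy as np

def rve_mean_effect(effects_by_study, variances_by_study, tau2=0.02):
    """Correlated-effects RVE estimate of a mean effect size (intercept-only model)."""
    num = 0.0            # running sum of w_j * T_ij over all effect sizes
    den = 0.0            # running sum of w_j over all effect sizes
    study_weights = []
    for T, v in zip(effects_by_study, variances_by_study):
        T = np.asarray(T, dtype=float)
        v = np.asarray(v, dtype=float)
        k_j = len(T)
        # One weight per study: inverse of k_j times (mean within-study variance + tau^2)
        w_j = 1.0 / (k_j * (v.mean() + tau2))
        num += w_j * T.sum()
        den += w_j * k_j
        study_weights.append(w_j)
    b = num / den        # weighted mean effect size

    # Robust (sandwich) variance: studies, not individual effect sizes, are the
    # independent units, so residuals are summed within each study before squaring.
    meat = sum((w_j * float(np.sum(np.asarray(T, dtype=float) - b))) ** 2
               for w_j, T in zip(study_weights, effects_by_study))
    se = np.sqrt(meat) / den
    return b, se

# Fabricated toy data: three studies contributing 2, 1, and 3 effect sizes.
effects = [[0.45, 0.30], [0.55], [0.20, 0.35, 0.40]]
variances = [[0.02, 0.03], [0.04], [0.05, 0.05, 0.06]]
b, se = rve_mean_effect(effects, variances)
print(f"robust mean effect = {b:.2f}, robust SE = {se:.2f}")
```

Because studies serve as the independent units in the sandwich variance, the robust standard error is itself imprecise when few studies are available, which is the source of the 40-study guideline noted above.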
A meta-regression model with all moderators entered simultaneously (e.g., Gersten,
Chard, et al., 2009; Wanzek et al., 2016) would have been preferable to our series of
analyses, which tested each moderator individually. However, the overall number of
studies that met the inclusion and study quality criteria was small, and not every study
included information that permitted coding of all moderators. Thus, analyzing all varia-
bles within one regression model was not feasible, as results would be
uninterpretable due to insufficient degrees of freedom. Therefore, the single-predictor
RVE meta-regression models used in the meta-analysis must be interpreted with caution
due to the potential confounding effects of other moderators that are not accounted for
in these models.
Issues in Coding Studies
Only a few studies (Denton et al., 2014; Vadasy, Sanders, & Tudor, 2007) provided a
rich description of the nature of instruction in the intervention. Most articles did not provide sufficient detail on how reading was taught; instead, they merely listed the areas
of instruction that were covered, provided a brief cursory explanation, or addressed
them in a figure or table with little sense of the amount of time devoted to activities or
what they actually entailed. Because written descriptions were often not detailed enough,
coding or classifying these areas was at times guesswork.
Given the difficulty we had with coding the instructional focus categories, we recommend that future intervention research articles include more detailed descriptions of each component of the intervention. However, this may be easier said than done, as journal article submission usually entails strict space limits. Understanding this, we would encourage authors to write detailed descriptions with sample lessons, post them on a website noted in the article, and provide access to the information. That would
allow those involved in research syntheses and those interested in replications to
access this material and ultimately have a better understanding of the nature of
the research.
Coding the at-risk category was a challenging task, as out of the 33 studies, only 16
could be used for the analysis examining the moderating role of the at-risk status variable.
This is because, across the 33 studies, there was little commonality in how at-risk status
was operationally defined (i.e., described as below grade-level performance; based on local
norms, national norms, researcher-developed measures, or validated screening measures),
making comparisons across studies difficult and underpowered. We also found the use of
norms from standardized tests to be problematic because some of the norms were much
older than others, and the field of early literacy instruction has undergone massive changes
in the past 15 years. In addition, there are likely to be shifts in national norms on some
measures, especially those involving phonological awareness, phonics, and possibly oral
reading fluency. It would be helpful if the field could adopt more consistent means of
determining the suitable samples for a Tier 2 reading intervention.
Fidelity of implementation was another area that was difficult to code due to the lack
of consistency across studies in how fidelity was explained and measured. For instance,
if different measurement systems are used, 80% fidelity in one study is not comparable
to 80% fidelity in another study. As a result, though this was very much an area of
interest for us, we could not code for fidelity as a moderator.
Implications for Future Research
Most intervention studies examine impacts immediately at the end of an intervention.
An important next step in reading intervention research, one only occasionally
attempted to date (e.g., Al Otaiba, Kim, Wanzek, Petscher, & Wagner, 2014; Blachman
et al., 2014; Vaughn et al., 2008), is to see whether the impacts on reading performance
are maintained, both with and without further intervention, in follow-up studies.
Additional intervention research is also needed in the area of vocabulary. Few studies
in our meta-analysis addressed reading vocabulary in a comprehensive manner during
the intervention, and only two studies (Gunn, Smolkowski, Biglan, Black, & Blair, 2005;
O'Connor, Swanson, & Geraghty, 2010) included vocabulary as an outcome measure.
We were therefore unable to draw conclusions on this crucial aspect of reading profi-
ciency. Future intervention research, especially in Grades 2 and 3, should include a sys-
tematic vocabulary instruction component in the interventions and assess its
effectiveness using reading vocabulary outcomes.
We would also encourage more intervention research in the areas of reading and lan-
guage comprehension, since these were areas of weaker impacts. Newer intervention
research (e.g., Foorman, Herrera, & Dombek, 2018) increasingly includes both reading and listening comprehension, and such interventions may lead to stronger impacts in the reading comprehension domain.
Acknowledgments
The authors wish to acknowledge the sage advice provided by Nancy Lewis and Terri Pigott, and
recognize Samantha Spallone, Pam Foremski, and Christopher Tran for their assistance.
Funding
This research was supported in part by Contract Number [ED-IES-12-C-0011]. The views do not
represent those of the U.S. Department of Education.
References
Al Otaiba, S., Connor, C. M., Folsom, J. S., Wanzek, J., Greulich, L., Schatschneider, C., &
Wagner, R. K. (2014). To wait in Tier 1 or intervene immediately: A randomized experiment
examining first-grade response to intervention in reading. Exceptional Children,81(1), 1127.
doi:10.1177/0014402914532234
Al Otaiba, S., & Fuchs, D. (2002). Characteristics of children who are unresponsive to early liter-
acy intervention: A review of the literature. Remedial and Special Education,23(5), 300316.
doi:10.1177/07419325020230050501
Al Otaiba, S., Kim, Y. S., Wanzek, J., Petscher, Y., & Wagner, R. K. (2014). Long-term effects of
first-grade multitier intervention. Journal of Research on Educational Effectiveness,7(3),
250267. doi:10.1080/19345747.2014.906692
Allor, J., & McCathren, R. (2004). The efficacy of an early literacy tutoring program implemented
by college students. Learning Disabilities Research and Practice,19(2), 116129. doi:10.1111/j.
1540-5826.2004.00095.x
Balu, R., Zhu, P., Doolittle, F., Schiller, E., Jenkins, J., & Gersten, R. (2015). Evaluation of response
to intervention practices for elementary school reading (NCEE 2016-4000). Washington, DC:
National Center for Education Evaluation and Regional Assistance, Institute of Education
Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/pubs/20164000/
pdf/20164000.pdf
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and power-
ful approach to multiple testing. Journal of the Royal Statistical Society: Series B
(Methodological),57(1), 289300. doi:10.1111/j.2517-6161.1995.tb02031.x
Berninger, V. W., Abbott, R. D., Vermeulen, K., & Fulton, C. M. (2006). Paths to reading com-
prehension in at-risk second-grade readers. Journal of Learning Disabilities,39(4), 334351.
doi:10.1177/00222194060390040701
Blachman, B. A., Schatschneider, C., Fletcher, J. M., Francis, D. J., Clonan, S. M., Shaywitz, B. A.,
& Shaywitz, S. E. (2004). Effects of intensive reading remediation for second and third graders
and a 1-year follow-up. Journal of Educational Psychology,96 (3), 444461. doi:10.1037/0022-
0663.96.3.444
Blachman, B. A., Schatschneider, C., Fletcher, J. M., Murray, M. S., Munger, K. A., & Vaughn,
M. G. (2014). Intensive reading remediation in grade 2 or 3: Are there effects a decade later?
Journal of Educational Psychology,106 (1), 4657. doi:10.1037/a0033663
Case, L. P., Speece, D. L., Silverman, R., Ritchey, K. D., Schatschneider, C., Cooper, D. H.,
Jacobs, D. (2010). Validation of a supplemental reading intervention for first-grade children.
Journal of Learning Disabilities,43(5), 402417. doi:10.1177/0022219409355475
Case, L., Speece, D., Silverman, R., Schatschneider, C., Montanaro, E., & Ritchey, K. (2014).
Immediate and long-term effects of tier 2 reading instruction for first-grade students with a
high probability of reading failure. Journal of Research on Educational Effectiveness,7(1),
2853. doi:10.1080/19345747.2013.786771
Center for Research and Reform in Education & Johns Hopkins University. (2019). Evidence for
ESSA: Standards and procedures. Retrieved from https://content.evidenceforessa.org/sites/
default/files/On%20clean%20Word%20doc.pdf
Denton, C. A., Fletcher, J. M., Taylor, W. P., Barth, A. E., & Vaughn, S. (2014). An experimental
evaluation of guided reading and explicit interventions for primary-grade students at-risk for
reading difficulties. Journal of Research on Educational Effectiveness,7(3), 268293. doi:10.1080/
19345747.2014.906010
Denton, C. A., Nimon, K., Mathes, P. G., Swanson, E. A., Kethley, C., Kurz, T. B., & Shih, M.
(2010). Effectiveness of a supplemental early reading intervention scaled up in multiple schools.
Exceptional Children,76 (4), 394416. doi:10.1177/001440291007600402
Denton, C. A., Tolar, T. D., Fletcher, J. M., Barth, A. E., Vaughn, S., & Francis, D. J. (2013).
Effects of tier 3 intervention for students with persistent reading difficulties and characteristics
of inadequate responders. Journal of Educational Psychology,105(3), 633648. doi:10.1037/
a0032581
Duval, S., & Tweedie, R. (2000). Trim and fill: A simple funnel-plot-based method of testing and
adjusting for publication bias in meta-analysis. Biometrics,56 (2), 455463. doi:10.1111/j.0006-
341X.2000.00455.x
Elleman, A. M., Lindo, E. J., Morphy, P., & Compton, D. L. (2009). The impact of vocabulary
instruction on passage-level comprehension of school-age children: A meta-analysis. Journal of
Research on Educational Effectiveness,2(1), 144. doi:10.1080/19345740802539200
Every Student Succeeds Act of 2015, Pub. L. No. 114-95, § 8101(21)(A), 129 Stat. 1939 (2015).
Fang, Z., Fu, D., & Lamme, L. L. (2004). From scripted instruction to teacher empowerment:
Supporting literacy teachers to make pedagogical transitions. Literacy (Formerly Reading),
38(1), 5864. doi:10.1111/j.0034-0472.2004.03801010.x
Fien, H., Smith, J. L. M., Smolkowski, K., Baker, S. K., Nelson, N. J., & Chaparro, E. (2015). An
examination of the efficacy of a multitiered intervention on early reading outcomes for first
grade students at risk for reading difficulties. Journal of Learning Disabilities,48(6), 602621.
doi:10.1177/0022219414521664
Foorman, B., Beyler, N., Borradaile, K., Coyne, M., Denton, C. A., Dimino, J., Wissel, S.
(2016). Foundational skills to support reading for understanding in kindergarten through 3rd
grade (NCEE 2016-4008). Washington, DC: National Center for Education Evaluation and
Regional Assistance (NCEE), Institute of Education Sciences, U.S. Department of Education.
Retrieved from https://ies.ed.gov/ncee/wwc/practiceguide/21
Foorman, B. R., Herrera, S., & Dombek, J. (2018). The relative impact of aligning Tier 2 interven-
tion materials with classroom core reading materials in grades K-2. The Elementary School
Journal,118(3), 477504. doi:10.1086/696021
Francis, D. J., Kulesz, P. A., & Benoit, J. S. (2018). Extending the simple view of reading to
account for variation within readers and across texts: The complete view of reading (CVR i).
Remedial and Special Education,39(5), 274288. doi:10.1177/0741932518772904
Fuchs, D., Compton, D. L., Fuchs, L. S., Bryant, J., & Davis, G. N. (2008). Making "secondary intervention" work in a three-tier responsiveness-to-intervention model: Findings from the
first-grade longitudinal reading study of the National Research Center on Learning Disabilities.
Reading and Writing,21(4), 413436. doi:10.1007/s11145-007-9083-9
Fuchs, D., & Fuchs, L. S. (2017). Critique of the national evaluation of response to intervention:
A case for simpler frameworks. Exceptional Children,83(3), 255268. doi:10.1177/
0014402917693580
Fuchs, L. S., Fuchs, D., Powell, S. R., Seethaler, P. M., Cirino, P. T., & Fletcher, J. M. (2008).
Intensive intervention for students with math disabilities: Seven principles of effective practice.
Learning Disability Quarterly,31(2), 7992. doi:10.2307/20528819
Gamse, B. C., Jacob, R. T., Horst, M., Boulay, B., & Unlu, F. (2008). Reading first impact study
final report (NCEE 2009-4038). Washington, DC: National Center for Education Evaluation
and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
Gersten, R., Chard, D., Jayanthi, M., Baker, S., Morphy, P., & Flojo, J. (2009). Mathematics
instruction for students with learning disabilities: A meta-analysis of instructional components.
Review of Educational Research,79(3), 12021242. doi:10.3102/0034654309334431
Gersten, R., Compton, D., Connor, C. M., Dimino, J., Santoro, L., Linan-Thompson, S., & Tilly,
W. D. (2009). Assisting students struggling with reading: Response to Intervention and multi-tier
intervention for reading in the primary grades. A practice guide (NCEE 2009-4045).
Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute
of Education Sciences, U.S. Department of Education. Retrieved from http://ies.ed.gov/ncee/
wwc/pdf/practice_guides/rti_reading_pg_021809.pdf
Gersten, R., Jayanthi, M., & Dimino, J. (2017). Too much, too soon? A commentary on what the
national RtI evaluation left unanswered and what reading intervention research tells us.
Exceptional Children,83(3), 244254. doi:10.1177/0014402917692847
Greco, T., Zangrillo, A., Biondi-Zoccai, G., & Landoni, G. (2013). Meta-analysis: Pitfalls and
hints. Heart, Lung and Vessels,5(4), 219225.
Gunn, B., Smolkowski, K., Biglan, A., Black, C., & Blair, J. (2005). Fostering the development of
reading skill through supplemental instruction results for Hispanic and non-Hispanic students.
The Journal of Special Education,39(2), 6685. doi:10.1177/00224669050390020301
Hedberg, E. C. (2011). ROBUMETA: Stata module to perform robust variance estimation in meta-
regression with dependent effect size estimates. Boston, MA: Boston College.
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related estima-
tors. Journal of Educational Statistics,6(2), 107128. doi:10.3102/10769986006002107
Hedges, L. V., Tipton, E., & Johnson, M. C. (2010). Robust variance estimation in meta-regres-
sion with dependent effect size estimates. Research Synthesis Methods,1(1), 3965. doi:10.1002/
jrsm.5
Imbens, G. W., & Lemieux, T. (2008). Regression discontinuity designs: A guide to practice.
Journal of Econometrics,142(2), 615635. doi:10.1016/j.jeconom.2007.05.001
Individuals with Disabilities Education Act, Pub. L. No. 108-446, 20 U.S.C. § 1400, 118 Stat. 2649 (2004).
Ioannidis, J. (2005). Why most published research findings are false. PLoS Medicine,2(8), e124.
doi:10.1371/journal.pmed.0020124
Jacob, R., Armstrong, C., Bowden, A. B., & Pan, Y. (2016). Leveraging volunteers: An experimen-
tal evaluation of a tutoring program for struggling readers. Journal of Research on Educational
Effectiveness,9(Supp 1), 6792. doi:10.1080/19345747.2016.1138560
Jenkins, J. R., Peyton, J. A., Sanders, E. A., & Vadasy, P. F. (2004). Effects of reading decodable
texts in supplemental first-grade tutoring. Scientific Studies of Reading,8(1), 5385. doi:10.
1207/s1532799xssr0801_4
Lane, H. B., Pullen, P. C., Hudson, R. F., & Konold, T. R. (2009). Identifying essential instruc-
tional components of literacy tutoring for struggling beginning readers. Literacy Research and
Instruction,48(4), 277297. doi:10.1080/19388070902875173
López-López, J. A., Van den Noortgate, W., Tanner-Smith, E. E., Wilson, S. J., & Lipsey, M. W.
(2017). Assessing meta-regression methods for examining moderator relationships with
dependent effect sizes: A Monte Carlo simulation. Research Synthesis Methods,8(4), 435450.
doi:10.1002/jrsm.1245
May, H., Sirinides, P., Gray, A., & Goldsworthy, H. (2016). Reading recovery: An evaluation of the
four-year i3 scale-up. Philadelphia, PA: Consortium for Policy Research in Education,
University of Pennsylvania.
National Institute of Child Health and Human Development [NICHD]. (2000). Report of the
National Reading Panel. Teaching children to read: Reports of the subgroups (NIH Publication
No. 00-4754). Washington, DC: U.S. Department of Health and Human Services. Retrieved
from https://www.nichd.nih.gov/sites/default/files/publications/pubs/nrp/Documents/report.pdf
No Child Left Behind Act of 2001 [NCLB], Pub. L. No. 107-110, § 1201, 115 Stat. 1425 (2002).
O'Connor, R. E., Swanson, H. L., & Geraghty, C. (2010). Improvement in reading rate under
independent and difficult text levels: Influences on word and comprehension skills. Journal of
Educational Psychology,102(1), 119. doi:10.1037/a0017488
Pullen, P. C., Lane, H. B., & Monaghan, M. C. (2004). Effects of a volunteer tutoring model on
the early literacy development of struggling first grade students. Reading Research and
Instruction,43(4), 2140. doi:10.1080/19388070409558415
Scanlon, D. M., Vellutino, F. R., Small, S. G., Fanuele, D. P., & Sweeney, J. M. (2005). Severe
reading difficulties: Can they be prevented? A comparison of prevention and intervention
approaches. Exceptionality,13(4), 209227. doi:10.1207/s15327035ex1304_3
Schwartz, R. M. (2005). Literacy learning of at-risk first-grade students in the reading recovery
early intervention. Journal of Educational Psychology,97(2), 257267. doi:10.1037/0022-0663.97.
2.257
Slavin, R. E., Lake, C., Davis, S., & Madden, N. A. (2011). Effective programs for struggling read-
ers: A best-evidence synthesis. Educational Research Review,6(1), 126. doi:10.1016/j.edurev.
2010.07.002
Smith, J. L. M., Nelson, N. J., Fien, H., Smolkowski, K., Kosty, D., & Baker, S. K. (2016).
Examining the efficacy of a multitiered intervention for at-risk readers in grade 1. The
Elementary School Journal,116 (4), 549573. doi:10.1086/686249
StataCorp. (2015). Stata statistical software (Release 14). College Station, TX: StataCorp LP.
Stuebing, K. K., Barth, A. E., Trahan, L. H., Reddy, R. R., Miciak, J., & Fletcher, J. M. (2015). Are
child cognitive characteristics strong predictors of responses to intervention? A meta-analysis.
Review of Educational Research,85(3), 395429. doi:10.3102/0034654314555996
Swanson, H. L. (1999). Reading research for students with LD: A meta-analysis of intervention
outcomes. Journal of Learning Disabilities,32(6), 504532. doi:10.1177/002221949903200605
Tanner-Smith, E. E., & Tipton, E. (2014). Robust variance estimation with dependent effect sizes:
Practical considerations including a software tutorial in Stata and SPSS. Research Synthesis
Methods,5(1), 1330. doi:10.1002/jrsm.1091
Tipton, E. (2015). Small sample adjustments for robust variance estimation with meta-regression.
Psychological Methods,20(3), 375393. doi:10.1037/met0000011
Tivnan, T., & Hemphill, L. (2005). Comparing four literacy reform models in high-poverty
schools: Patterns of first-grade achievement. The Elementary School Journal,105(5), 419441.
doi:10.1086/431885
Tran, L., Sanchez, T., Arellano, B., & Swanson, H. L. (2011). A meta-analysis of the RTI literature
for children at risk for reading disabilities. Journal of Learning Disabilities,44(3), 283295. doi:
10.1177/0022219410378447
U.S. Department of Education [U.S. ED], Institute of Education Sciences [IES], & What Works
Clearinghouse [WWC]. (2013). What Works Clearinghouse: Procedures and standards handbook
(Version 3.0). Retrieved from https://ies.ed.gov/ncee/wwc/Docs/referenceresources/wwc_proce-
dures_v3_0_standards_handbook.pdf
Vadasy, P. F., & Sanders, E. A. (2011). Efficacy of supplemental phonics-based instruction for
low-skilled first graders: How language minority status and pretest characteristics moderate
treatment response. Scientific Studies of Reading,15(6), 471497. doi:10.1080/10888438.2010.
501091
Vadasy, P. F., Sanders, E. A., & Peyton, J. A. (2006). Paraeducator-supplemented instruction in
structural analysis with text reading practice for second and third graders at risk for reading
problems. Remedial and Special Education,27(6), 365378. doi:10.1177/07419325060270060601
Vadasy, P. F., Sanders, E. A., & Tudor, S. (2007). Effectiveness of paraeducator-supplemented
individual instruction: Beyond basic decoding skills. Journal of Learning Disabilities,40(6),
508525. doi:10.1177/00222194070400060301
Vaughn, S., Cirino, P. T., Tolar, T., Fletcher, J. M., Cardenas-Hagan, E., Carlson, C. D., &
Francis, D. J. (2008). Long-term follow-up of Spanish and English interventions for first-grade
English language learners at risk for reading problems. Journal of Research on Educational
Effectiveness,1(3), 179214. doi:10.1080/19345740802114749
Vellutino, F. R., & Scanlon, D. M. (2002). The Interactive Strategies approach to reading interven-
tion. Contemporary Educational Psychology,27(4), 573635. doi:10.1016/S0361-476X(02)00002-4
Wang, C., & Algozzine, B. (2008). Effects of targeted intervention on early literacy skills of at-risk
students. Journal of Research in Childhood Education,22(4), 425439. doi:10.1080/
02568540809594637
Wanzek, J., Stevens, E. A., Williams, K. J., Scammacca, N., Vaughn, S., & Sargent, K. (2018).
Current evidence on the effects of intensive early reading interventions. Journal of Learning
Disabilities,51(6), 612624. doi:10.1177/0022219418775110
Wanzek, J., & Vaughn, S. (2007). Research-based implications from extensive early reading inter-
ventions. School Psychology Review,36 (4), 541561.
Wanzek, J., & Vaughn, S. (2008). Response to varying amounts of time in reading intervention
for students with low response to intervention. Journal of Learning Disabilities,41(2), 126142.
doi:10.1177/0022219407313426
Wanzek, J., Vaughn, S., Scammacca, N., Gatlin, B., Walker, M. A., & Capin, P. (2016). Meta-anal-
yses of the effects of tier 2 type reading interventions in grades K-3. Educational Psychology
Review,28(3), 551576. doi:10.1007/s10648-015-9321-7
Wilson, P., Martens, P., & Arya, P. (2005). Accountability for reading and readers: What the
numbers don't tell. The Reading Teacher, 58(7), 622-631. doi:10.1598/RT.58.7.3