Nonexperimental Replications of Social Experiments: A Systematic Review

Intervention Research in Social Work: Recent Advances and Continuing Challenges

Article

May 2004
RES SOCIAL WORK PRAC

Mark W. Fraser

The purpose of this article is to review substantive and methodological advances in interventive research. Three substantive advances are discussed: (a) the growing use of a risk factor perspective, (b) the emergence of practice-relevant microsocial theories, and (c) the increased acceptance of structured treatment protocols and manuals. In addition, three methodological developments are discussed. They include new developments for dealing with attrition, for dealing with selection effects, and for decomposing complexities using text and numerical analyses. Arguing that intervention research holds the potential to unify research scholarship in social work, the conclusion discusses ongoing challenges associated with the implementation of new programs, variance in outcomes by method, reactivity to measurement, and construct validity in the context of culture.

PROTOCOL: Impacts of after-school programs on student outcomes

Article

Jan 2005

Sustainable healthy eating behaviour of young adults: Towards a novel methodological approach

Article

Jul 2016
BMC PUBLIC HEALTH

Background: Food, nutrition and health policy makers face two issues more pressing than any other: obesity and climate change. Consumer research has focused primarily on specific areas of sustainable food, such as organic food, local or traditional food, and meat substitution and/or reduction. A more holistic view of sustainable healthy eating behaviour has received less attention, although more research is emerging in this area. Methods/design: This study protocol aims to investigate young consumers' attitudes and behaviour towards sustainable and healthy eating by applying a multidisciplinary approach that takes into account economic, marketing, public health and environmental issues. To achieve this goal, the reactions of a group of young adults to interactive tailored informational messages about eating that is healthy and sustainable from a social, environmental and economic point of view will be investigated using a randomised controlled trial. The empirical research is divided into three studies: 1) qualitative longitudinal research to explore openness to adopting sustainable healthy eating behaviour; 2) qualitative research with the objective of developing a sustainable healthy eating behaviour index; and 3) a randomised controlled trial to describe young consumers' reactions to interactive tailored messages about sustainable healthy eating. Discussion: To our knowledge, this is the first randomised controlled trial to test young adults' reactions to interactive tailor-made messages on sustainable healthy eating using a mobile smartphone app. Mobile applications designed to deliver interventions offer new possibilities to influence young adults' behaviour in relation to diet and sustainability. Therefore, the study will provide valuable insights into drivers of change towards more environmentally sustainable and healthy eating behaviours.
Trial registration: NCT02776410 registered May 16, 2016.

Technology-Enhanced Elementary and Middle School Science II (TEEMSS II) Research Report 1

Article

The District-Wide Effectiveness of the Achieve3000 Program: A Quasi-Experimental Study

Article

May 2023

Public Pensions and Income Transfer Between the Elderly and Their Children

Chapter

Mar 2023

Using data from the China Health and Retirement Longitudinal Survey of 2013 and 2015, this study investigates the effect of enrollment in public pensions and the amount of public pension benefits on the income transfer between the elderly and their children. The three conclusions are as follows. First, in general, there is a flattening and then rising trend in the relationship between enrollment in pensions and net transfer income; the effect of the amount of pension benefits on net transfer income is negative but not significant. Second, the amount of pension benefit of the New Rural Social Pension Insurance does not significantly affect the net transfer income, transfer income from children, or transfer income to children, while the pension benefit of the Employees’ Basic Pension Insurance has a significant positive effect on the transfer income to children. Third, the effects of pensions differ across heterogeneous groups. The need for high pension income is greater for disadvantaged groups, such as older women, single elderly, co-resident elderly, elderly with chronic diseases or disabilities, and elderly in the rural central and eastern regions.
Keywords: Pension benefit; Net transfer income; Transfer income to children; Transfer income from children; New Rural Social Pension Insurance (NRSPI)

Using Trial and Observational Data to Assess Effectiveness: Trial Emulation, Transportability, Benchmarking, and Joint Analysis

Article

Feb 2023
EPIDEMIOL REV

Comparisons between randomized trial analyses and observational analyses that attempt to address similar research questions have generated many controversies in epidemiology and the social sciences. There has been little consensus on when such comparisons are reasonable, what their implications are for the validity of observational analyses, or whether trial and observational analyses can be integrated to address effectiveness questions. Here, we consider methods for using observational analyses to complement trial analyses when assessing treatment effectiveness. First, we review the framework for designing observational analyses that emulate target trials and present an evidence map of its recent applications. We then review approaches for estimating the average treatment effect in the target population underlying the emulation: using observational analyses of the emulation data alone; and using transportability analyses to extend inferences from a trial to the target population. We explain how comparing treatment effect estimates from the emulation against those from the trial can provide evidence on whether observational analyses can be trusted to deliver valid estimates of effectiveness - a process we refer to as benchmarking - and, in some cases, allow the joint analysis of the trial and observational data. We illustrate different approaches using a simplified example of a pragmatic trial and its emulation in registry data. We conclude that synthesizing trial and observational data - in transportability, benchmarking, or joint analyses - can leverage their complementary strengths to enhance learning about comparative effectiveness, through a process combining quantitative methods and epidemiological judgements.

A comparison of four quasi-experimental methods: an analysis of the introduction of activity-based funding in Ireland

Article

Nov 2022
BMC HEALTH SERV RES

Background: Health services research often relies on quasi-experimental study designs in the estimation of treatment effects of a policy change or an intervention. The aim of this study is to compare some of the commonly used non-experimental methods in estimating intervention effects, and to highlight their relative strengths and weaknesses. We estimate the effects of Activity-Based Funding, a hospital financing reform of Irish public hospitals, introduced in 2016. Methods: We estimate and compare four analytical methods: Interrupted Time Series analysis, Difference-in-Differences, Propensity Score Matching Difference-in-Differences and the Synthetic Control method. Specifically, we focus on the comparison between the control-treatment methods and the non-control-treatment approach, interrupted time series analysis. Our empirical example evaluated the impact on length of stay after hip replacement surgery, following the introduction of Activity-Based Funding in Ireland. We also contribute to the very limited research reporting the impacts of Activity-Based Funding within the Irish context. Results: Interrupted time series analysis produced statistically significant results that differed in interpretation, while the Difference-in-Differences, Propensity Score Matching Difference-in-Differences and Synthetic Control methods, which incorporate control groups, suggested no statistically significant intervention effect on patient length of stay. Conclusion: Our analysis confirms that different analytical methods for estimating intervention effects provide different assessments of the intervention effects. It is crucial that researchers employ appropriate designs which incorporate a counterfactual framework. Such methods tend to be more robust and provide a stronger basis for evidence-based policy-making.
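
The contrast this abstract draws between a design without a control group and designs with one can be illustrated with a minimal sketch. It is not the study's analysis: a simple pre/post change stands in for the no-control approach (a real interrupted time series would also model the pre-intervention trend), the function names are illustrative, and the lengths of stay are made-up numbers.

```python
from statistics import mean

def pre_post_change(pre, post):
    """Single-group estimate: change in the group's mean outcome."""
    return mean(post) - mean(pre)

def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences: treated change minus control change."""
    return (pre_post_change(treated_pre, treated_post)
            - pre_post_change(control_pre, control_post))

# Hypothetical lengths of stay in days (illustrative only, not study data).
treated_pre, treated_post = [6.0, 5.5, 6.5], [5.0, 4.5, 5.5]
control_pre, control_post = [6.5, 5.5, 6.0], [5.5, 4.5, 5.0]

naive = pre_post_change(treated_pre, treated_post)
did = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"pre/post change in treated group: {naive:+.2f} days")
print(f"difference-in-differences:        {did:+.2f} days")
```

With these invented numbers both groups shorten their stays by the same amount, so the single-group change is negative while the difference-in-differences estimate is zero. That is the kind of divergence between methods the abstract reports.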

Internet and Firms’ Exports and Imports: Firm level evidence from China

Article

Jun 2022

This study investigates the impact of the Internet on Chinese firms’ export and import performance by using China’s industrial enterprise and customs data and adopting a propensity score matching difference‐in‐differences method (PSM‐DID). The empirical results show that utilizing the Internet has positive effects on firms’ exports and imports; however, the effects are mainly concentrated in the first 2‐3 years. The positive effect on exports is larger than that on domestic sales; thus, the Internet increases export intensity. Further, we investigate the effects of the Internet on the three margins of Chinese exports. First, we borrow the multi‐product multi‐destination firm exporting theory developed by Bernard et al. (2011) and find that the Internet improves not only the extensive margin between firms, but also the within‐firm extensive margin. We then investigate the effects of the Internet on product quality and find that the Internet has a negative effect, since Chinese firms export more products to developing countries, where the requirements for product quality are relatively low, than to other countries. Our findings empirically justify the “Internet Plus” strategy proposed by the Chinese government with regard to international trade.

Gains or losses? A quantitative estimation of environmental and economic effects of an ecological compensation policy

Article

May 2021
ECOL APPL

Ecological compensation is an innovative and effective tool to explore the coordinated development of socioeconomic prosperity and ecological protection, especially for a watershed crossing different regions. It converts the externalities of ecosystem services into practical financial incentives for local stakeholders. This empirical study applies a quantitative policy evaluation approach to evaluate the environmental and economic effects of an ecological compensation policy, using the paddy land–to–dry land (PLDL) program implemented in China’s Miyun Reservoir watershed as an example. The study is based on responses to a 2017 questionnaire regarding agricultural production inputs and outputs administered to 269 households in Hebei Province, where the PLDL program has been operational for over 10 years. The results show that the program reduced nitrogen usage by 24% on average in 2017 and decreased the total nitrogen emission load by 16.98 tons for the entire case area, which accounts for approximately 18.6% of the total nitrogen load reduction of the Miyun Reservoir basin. However, the upstream households involved in this program have experienced agricultural income losses higher than that allowed for by the current compensation criterion. Therefore, this paper discusses the factors that should be considered in the process of determining ecological compensation criteria. In particular, the paper proposes a differential compensation scheme based on the environmental effect at the individual level to avoid a standard payment for all households irrespective of their different contributions. This differential compensation payment scheme facilitates the fair treatment of environmental contributors and maximizes environmental benefits through an equitable allocation of limited ecological compensation funds. This study serves as a theoretical and practical reference for further improvement of the current ecological compensation policy in China. The study also sheds light on practices for estimating ecological compensation criteria and formulating ecological compensation policies for other regions or countries in the future.

A Quasi-Experimental Study of the Impacts of the Kids Read Now Summer Reading Program

Article

Dec 2020

Drawing on administrative data and reading achievement data provided by two Midwestern school districts for three schools, we analyze the literacy impacts of a replicable summer reading program, Kids Read Now. The program includes both school-based and home-based components that together encourage students to remain engaged in reading high-quality books over the summer months. We apply propensity score matching methods to match participating Kids Read Now students with similar comparison students. Our results suggest that Kids Read Now participants outperformed comparison group students, with a mean effect size of d = .12. Additional model estimates of the impacts for those students who read more of the books provided by Kids Read Now revealed that those who received all 9 books realized an effect size of d = .18 relative to the outcomes for matched comparison students. We discuss how these results might be considered in light of prior findings on summer learning.
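
The matching step this abstract describes can be sketched as follows. This is an assumption-laden illustration, not the study's procedure: it assumes propensity scores have already been estimated, uses 1-to-1 nearest-neighbour matching with replacement and a caliper (the abstract does not say which matching algorithm was used), and all names and data are hypothetical.

```python
def match_nearest(treated, controls, caliper=0.1):
    """1-to-1 nearest-neighbour matching with replacement on a precomputed score.

    Each unit is a (propensity_score, outcome) tuple. Treated units whose
    closest control lies outside the caliper are dropped.
    Returns a list of (treated_outcome, control_outcome) pairs.
    """
    pairs = []
    for t_score, t_outcome in treated:
        c_score, c_outcome = min(controls, key=lambda c: abs(c[0] - t_score))
        if abs(c_score - t_score) <= caliper:
            pairs.append((t_outcome, c_outcome))
    return pairs

def att(pairs):
    """Average treatment effect on the treated: mean matched outcome difference."""
    if not pairs:
        raise ValueError("no matched pairs within the caliper")
    return sum(t - c for t, c in pairs) / len(pairs)

# Hypothetical (propensity score, reading score) data, not from the study.
treated = [(0.80, 52.0), (0.60, 48.0), (0.30, 45.0)]
controls = [(0.78, 50.0), (0.55, 47.0), (0.10, 40.0), (0.62, 46.0)]

pairs = match_nearest(treated, controls)
print(f"matched pairs: {len(pairs)}, ATT estimate: {att(pairs):.2f}")
```

In this toy data the third treated unit has no control within the caliper and is dropped, so the estimate rests on two matched pairs. Dividing the raw difference by a pooled standard deviation would turn it into a standardized effect size of the kind the abstract reports as d.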

The effect of religious belief on Chinese elderly health

Article

May 2020
BMC PUBLIC HEALTH

Background: With the accelerated ageing of the population in China, the health problems of elderly people have attracted much attention. Although religious belief has been shown to be a key way to improve the health of elderly people in various studies, little is known about the causal relationship between these variables in China. This paper explores the effect of religious belief on the health of elderly people in China, which will provide an important reference for China to achieve healthy ageing. Methods: Balanced panel data collected between 2012 and 2016 from the China Family Panel Studies (CFPS) were used. Health was assessed using self-rated health, and religious belief was measured by whether the respondents believed in a religion. The DID+PSM method was employed to solve the endogeneity problem caused by self-selection and omitted variables. In addition, the CESD score (replacing self-rated health) and different matching methods (the method of PSM after DID method) were used to perform the robustness test. Results: The results show that religious belief has no significant effect on the health of elderly people. With the application of different matching methods (one-to-one matching, K-nearest neighbour matching, radius matching and kernel matching) and replacing the health indicator (the CESD score) with the above matching methods, the results are still robust. Conclusion: In China, religious belief plays a limited role in promoting "healthy ageing", and it is difficult to improve the health of elderly people only via religious belief. Therefore, in addition to focusing on the guidance of religion with regard to healthy lifestyles, multiple measures need to be taken to improve the health of elderly people.

RAND School Leadership Intervention Evaluation Toolkit

Book

Jan 2018

Lessons Learned from Large-Scale Randomized Experiments

Article

Oct 2017

Large-scale randomized studies provide the best means of evaluating practical, replicable approaches to improving educational outcomes. This article discusses the advantages, problems, and pitfalls of these evaluations, focusing on alternative methods of randomization, recruitment, ensuring high-quality implementation, dealing with attrition, and data analysis. It also discusses means of increasing the chances that large randomized experiments will find positive effects, and interpreting effect sizes.

Can matching grants promote exports? Evidence from Tunisia's FAMEX II program.

Chapter

Jun 2011

The impact of Three Mexican Nutritional Programs: The Case of DIF-Puebla

Research

Nov 2015

Daniel Zaga Szenker

This paper presents an impact evaluation of three nutritional programs implemented in Puebla, Mexico, run by SEDIF, a social assistance institution. The study uses both propensity score matching and weighting to balance the treatment and control groups in terms of observable characteristics, and then to estimate the causal effect of the programs in different areas: food support, food orientation, education, and health. This investigation adds strong empirical evidence about the beneficial effects of nutritional programs on growth indicators (i.e., on anthropometric variables). In addition, it provides some evidence about the favorable impact of this kind of program on food orientation outcomes, such as eating habit changes or diet diversity, variety, and quality. However, this study finds only marginal effects on food security and detrimental effects on educational outcomes (specifically on students' marks). Finally, it does not provide conclusive evidence of effects on health.

Health Consequences of Rural-to-Urban Migration: Evidence from Panel Data in China

Article

Jun 2015
HEALTH ECON

This paper provides new empirical evidence on the health consequences of rural-to-urban migration in China. We use a panel dataset from 2003 to 2006 constructed by the Research Center on the Rural Economy at the Ministry of Agriculture in China to investigate the effects of short-term and medium-term migration on health status. By combining propensity-score matching and the difference-in-difference model, we attempt to overcome the migration endogeneity issue and estimate the average treatment effect on the treated. We find that the effect of short-term migration on health in China is significantly positive mostly because of the income effect. However, the effect of longer-term continuous migration on health is insignificant and close to zero. Our results are robust to several alternative estimation techniques and a series of robustness checks. Copyright © 2015 John Wiley & Sons, Ltd.

Impacts of a Summer Learning Program: A Random Assignment Study of Building Educated Leaders for Life (BELL)

Article

Health Care System Reforms in Developing Countries

Article

Jul 2022

Wei Han

This article proposes a critical but non-systematic review of recent health care system reforms in developing countries. The literature reports mixed results as to whether reforms improve the financial protection of the poor or not. We discuss the reasons for these differences by comparing three representative countries: Mexico, Vietnam, and China. First, the design of the health care system reform, as well as the summary of its evaluation, is briefly described for each country. Then, the discussion is developed along two lines: policy design and evaluation methodology. The review suggests that i) background differences, such as social development, poverty level, and population health should be considered when taking other countries as a model; ii) although demand-side reforms can be improved, more attention should be paid to supply-side reforms; and iii) the findings of empirical evaluation might be biased due to the evaluation design, the choice of outcome, data quality, and evaluation methodology, which should be borne in mind when designing health care system reforms.

Sample Size, Effect Size, and Statistical Power: A Replication Study of Weisburd’s Paradox

Article

Dec 2014

Objectives. This study expands upon Weisburd’s work (1993) by reexamining the relationship between sample size and statistical power in criminological experiments. This inquiry, now known as the Weisburd paradox, postulates that increasing the sample size of experiments does not always lead to increases in statistical power. The current research also begins to explore the potential sources of the Weisburd paradox. Methods. The effect sizes and statistical power are computed for the outcome measures (n=402) of all experiments (n=66) included in systematic reviews published by the Campbell Collaboration’s Crime and Justice Coordinating Group. The design sensitivity of these experiments is reviewed by sample size, as well as other factors that may explain the variation in effect sizes and statistical power across studies. Results. Effect sizes decline as the sample size of the experiment increases, whereas statistical power is unrelated to sample size but strongly associated with effect size. Disclosure of fidelity issues and publication bias is unrelated to statistical power and treatment effects. Variability in the dependent variable and sample demographics are significantly related to statistical power, but not to effect size. Conclusions. The study finds support for the Weisburd paradox, as the ability to manipulate statistical power by increasing sample size is not as strong as statistical theory would suggest, and experiments with larger sample sizes on the whole produce smaller effects. It is believed that a relationship was not observed between sample size and statistical power because the sensitivity gained from increasing sample size is offset by effect size simultaneously decreasing.
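
The offsetting mechanism described in the conclusions can be made concrete with a small power calculation. This is a sketch under stated assumptions, not the study's method: it uses the normal approximation for a two-sided, two-sample comparison with equal group sizes, and the function name and the effect sizes and sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d is the standardized mean difference; both groups have n_per_group units.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(d) * sqrt(n_per_group / 2)  # noncentrality under the alternative
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Hypothetical scenarios: same effect with a larger sample, and a
# larger sample paired with a smaller effect.
for d, n in [(0.5, 64), (0.5, 256), (0.25, 256)]:
    print(f"d={d:<5} n per group={n:<4} power={approx_power(d, n):.3f}")
```

Under this approximation power depends on the product of the effect size and the square root of the sample size, so quadrupling the sample while the effect halves leaves power unchanged. That is one way sample size and power can appear unrelated across studies.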

Effectiveness of a prolonged incarceration and rehabilitation measure for high-frequency offenders

Article

Jan 2013
J Exp Criminol

Objectives: To estimate the incapacitation effect and the impact on post-release recidivism of a measure combining prolonged incarceration and rehabilitation, the ISD measure for high-frequency offenders (HFOs) was compared to the standard practice of short-term imprisonment. Methods: We applied a quasi-experimental design with observational data to study the effects of ISD. The intervention group consisted of all HFOs released from ISD in the period 2004–2008. Two control groups were derived from the remaining population of HFOs who were released from a standard prison term. To form groups of controls, a combination of multiple imputation (MI) and propensity score matching (PSM) was used, including a large number of covariates. To measure the incapacitation effect of ISD, the number of convictions and recorded offences in a criminal case of the controls were counted in the same period as their ISD counterfactuals were incarcerated. The impact on recidivism was measured by the prevalence and the frequency of reconvictions corrected for time at risk. Robustness of the results was checked by performing a combined PSM and difference-in-difference (DD) design. Results: The estimate of the incapacitation effect was on average 5.7 criminal cases and 9.2 offences per ISD measure. On average 2.5 convictions and 4 recorded offences per year per HFO are prevented. The HFOs released from ISD showed 12 to 16 % lower recidivism rates than their control HFOs released from prison (Cohen’s h = 0.3–0.4). The recidivists of the ISD group also showed a lower reconviction frequency than the control group recidivists (Cohen’s d = 0.2). Conclusions: The ISD measure seems to be effective in reducing recidivism and crime. The estimated incapacitation effect showed that a large portion of criminal cases and offences was prevented. DD analysis and sensitivity analyses confirmed the robustness of the PSM results. Due to the absence of actual treatment data, the effects found cannot be attributed separately to resocialization, imprisonment, or improvement of life circumstances.
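
For readers unfamiliar with the effect size quoted for the recidivism rates, Cohen's h compares two proportions after an arcsine transformation. The sketch below uses the standard formula. The two prevalences are hypothetical, since the abstract does not report the underlying rates.

```python
from math import asin, sqrt

def cohens_h(p1, p2):
    """Cohen's h: difference between arcsine-transformed proportions."""
    return 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# Hypothetical reconviction prevalences (not taken from the study).
p_control = 0.72
p_isd = 0.58

h = cohens_h(p_control, p_isd)
print(f"Cohen's h = {h:.2f}")
```

Because of the transformation, the same gap in percentage points yields a different h depending on where the two proportions lie between 0 and 1.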

Making Valid Causal Inferences About Corrective Actions by Parents From Longitudinal Data

Article

Dec 2013

As a result of an inherent selection bias, most longitudinal analyses are biased against corrective actions that parents use to address perceived child problems. This bias can lead to unjustified or even counterproductive recommendations about corrective parental actions. To overcome this bias, this article summarizes current scholarship on improving the validity of causal inferences. Enhancing research designs is preferred, using quasi-experimental design components and natural experiments. Comparing a typical longitudinal design with a one-group pre-post design shows how longitudinal designs can be improved to enhance causal validity. Perfect statistical controls for confounds or perfect instrumental variables to circumvent them could produce unbiased causal evidence. Strategies to approximate that ideal are summarized, as well as methods to check for the adequacy of those approximations.

Perspectives on Evidence-Based Research in Education—What Works? Issues in Synthesizing Educational Program Evaluations

Article

Jan 2008

Robert E. Slavin

Syntheses of research on educational programs have taken on increasing policy importance. Procedures for performing such syntheses must therefore produce reliable, unbiased, and meaningful information on the strength of evidence behind each program. Because evaluations of any given program are few in number, syntheses of program evaluations must focus on minimizing bias in reviews of each study. This article discusses key issues in the conduct of program evaluation syntheses: requirements for research design, sample size, adjustments for pretest differences, duration, and use of unbiased outcome measures. It also discusses the need to balance factors such as research designs, effect sizes, and numbers of studies in rating the overall strength of evidence supporting each program.

Using systematic reviews to improve social care

Article

Geraldine Macdonald

Effective Programs in Middle and High School Mathematics: A Best-Evidence Synthesis

Article

Jun 2009

This article reviews research on the achievement outcomes of mathematics programs for middle and high schools. Study inclusion requirements include use of a randomized or matched control group, a study duration of at least 12 weeks, and equality at pretest. There were 100 qualifying studies, 26 of which used random assignment to treatments. Effect sizes were very small for mathematics curricula and for computer-assisted instruction. Positive effects were found for two cooperative learning programs. Outcomes were similar for disadvantaged and nondisadvantaged students and for students of different ethnicities. Consistent with an earlier review of elementary programs, this article concludes that programs that affect daily teaching practices and student interactions have more promise than those emphasizing textbooks or technology alone.

Effective Programs in Elementary Mathematics: A Best-Evidence Synthesis

Article

Sep 2008

This article reviews research on the achievement outcomes of three types of approaches to improving elementary mathematics: mathematics curricula, computer-assisted instruction (CAI), and instructional process programs. Study inclusion requirements included use of a randomized or matched control group, a study duration of at least 12 weeks, and achievement measures not inherent to the experimental treatment. Eighty-seven studies met these criteria, of which 36 used random assignment to treatments. There was limited evidence supporting differential effects of various mathematics textbooks. Effects of CAI were moderate. The strongest positive effects were found for instructional process approaches such as forms of cooperative learning, classroom management and motivation programs, and supplemental tutoring programs. The review concludes that programs designed to change daily teaching practices appear to have more promise than those that deal primarily with curriculum or technology alone.

Study Quality Assessment in Systematic Reviews of Research on Intervention Effects

Article

Feb 2008
RES SOCIAL WORK PRAC

Objective: The goal of this study is to advance an approach to the assessment of the quality of studies considered for inclusion in systematic reviews of the effects of social-care interventions. Method: To achieve this objective, quality is defined in relation to the widely accepted validity typology; prominent approaches to study quality assessment are evaluated as to their adequacy. Results: Problems with these approaches are identified. Conclusion: A formal, yet explicit, multidimensional approach to assessment grounded in substantive issues relevant to the intervention and the broader context in which it is embedded is promoted. Uncritical and exclusive use of indicators of study quality such as publication status, reporting quality, and single summative quality scores is rejected.

Large-Scale Social Experimentation in Britain: What Can and Cannot be Learnt from the Employment Retention and Advancement Demonstration?

Article

Apr 2005
Evaluation

The Employment Retention and Advancement (ERA) Demonstration programme is a major current welfare-to-work social experiment, the largest random allocation evaluation ever mounted in Great Britain. This article draws on experience gained in designing the ERA Demonstration to explore the strengths and limitations of social experimentation for policy evaluation and analysis. The focus of the discussion is on the reasons for the choice of random allocation as a means of estimating programme impacts, contrasting this approach with the alternatives. The weaknesses of random allocation designs are also examined in the light of the types of information policy-makers require from evaluations of labour market programmes and social policy demonstrations. The perennial ‘black box’ problem and the difficulties in generalizing from social experiments are given particular prominence.

Impacts of after-school programs on student outcomes: A systematic review for the Campbell collaboration

Article

Jan 2006

Does the development of China's high-speed rail improve the total-factor carbon productivity of cities?

Article

Apr 2022
TRANSPORT RES D-TR E

China's high-speed rail (HSR) has developed rapidly since the beginning of the 21st century, exerting significant influence on many aspects of life, such as economic development and residents' travel modes. Improving carbon productivity is one of the measures needed to realize China's carbon neutrality goal while sustaining economic growth. However, previous research has not dealt with the exact impact of HSR opening on carbon emission performance. This study seeks to fill this gap. Using the difference-in-differences (DID) model, we find that the opening of HSR significantly improves cities' total-factor carbon productivity in China. In addition, the influencing mechanism and heterogeneity of the impact are discussed, and the polarization effect is also analyzed. Overall, this study strengthens the idea that HSR construction has positive environmental externalities. The insights gained from this study may help policy makers formulate HSR planning and construction policies more accurately in the future.

Can Restoration of the Commons Foster Resilience? A Quasi-Experimental Comparison of COVID-19 Coping Strategies among Rural Households in Three Indian States

Article

Jan 2021

Within study comparisons and risk of bias in international development: Systematic review and critical appraisal

Article

Jul 2019

Executive Summary
Background: Many systematic reviews incorporate nonrandomised studies of effects, sometimes called quasi-experiments or natural experiments. However, the extent to which nonrandomised studies produce unbiased effect estimates is unclear in expectation or in practice. The usual way that systematic reviews quantify bias is through "risk of bias assessment" and indirect comparison of findings across studies using meta-analysis. A more direct, practical way to quantify the bias in nonrandomised studies is through "internal replication research", which compares the findings from nonrandomised studies with estimates from a benchmark randomised controlled trial conducted in the same population. Despite the existence of many risk of bias tools, none is conceptualised to assess comprehensively nonrandomised approaches with selection on unobservables, such as regression discontinuity designs (RDDs). The few that are conceptualised with these studies in mind do not draw on the extensive literature on internal replications (within-study comparisons) of randomised trials.
Objectives: Our research objectives were as follows. Objective 1: to undertake a systematic review of nonrandomised internal study replications of international development interventions. Objective 2: to develop a risk of bias tool for RDDs, an increasingly common method used in social and economic programme evaluation.
Methods: We used the following methods to achieve our objectives. Objective 1: we searched systematically for nonrandomised internal study replications of benchmark randomised experiments of social and economic interventions in low- and middle-income countries (L&MICs). We assessed the risk of bias in benchmark randomised experiments and synthesised evidence on the relative bias effect sizes produced by benchmark and nonrandomised comparison arms. Objective 2: we used document review and expert consultation to develop further a risk of bias tool for quasi-experimental studies of interventions (ROBINS-I) for RDDs.
Results: Objective 1: we located 10 nonrandomised internal study replications of randomised trials in L&MICs, six of which are of RDDs; the remainder use a combination of statistical matching and regression techniques. We found that benchmark experiments used in internal replications in international development are in the main well-conducted but have "some concerns" about threats to validity, usually arising from the methods of outcomes data collection. Most internal replication studies report on a range of different specifications for both the benchmark estimate and the nonrandomised replication estimate. We extracted and standardised 604 bias coefficient effect sizes from these studies, and present average results narratively. Objective 2: RDDs are characterised by prospective assignment of participants based on a threshold variable. Our review of the literature indicated there are two main types of RDD. The most common type is designed retrospectively: the researcher identifies post hoc the relationship between outcomes and a threshold variable which determines assignment to intervention at pretest. These designs usually draw on routine data collection such as administrative records or household surveys. The other, less common, type is a prospective design where the researcher is also involved in allocating participants to treatment groups from the outset. We developed a risk of bias tool for RDDs.
Conclusions: Internal study replications provide the grounds on which bias assessment tools can be evidenced. We conclude that existing risk of bias tools need to be further developed for use by Campbell Collaboration authors, and there is a wide range of risk of bias tools and internal study replications to draw on in better designing these tools. We have suggested the development of a promising approach for RDD. Further work is needed on common methodologies in programme evaluation, for example on statistical matching approaches. We also highlight that broader efforts to identify all existing internal replication studies should consider more specialised systematic search strategies within particular literatures, so as to overcome a lack of systematic indexing of this evidence.
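The comparison at the heart of an internal replication can be illustrated with a minimal sketch. The abstract does not state how its bias coefficients were standardised, so the formula below (difference between the nonrandomised and benchmark estimates, divided by the outcome standard deviation) is an assumption, and the function name and all numbers are hypothetical.

```python
from statistics import mean


def standardized_bias(nonexperimental_estimate, benchmark_estimate, outcome_sd):
    """Gap between a nonrandomised estimate and its RCT benchmark, in SD units.

    Assumed definition for illustration only; not taken from the review.
    """
    if outcome_sd <= 0:
        raise ValueError("outcome_sd must be positive")
    return (nonexperimental_estimate - benchmark_estimate) / outcome_sd


# Hypothetical (made-up) estimate pairs from three imaginary replications.
comparisons = [
    {"nonexp": 0.30, "rct": 0.25, "sd": 1.0},
    {"nonexp": 0.10, "rct": 0.20, "sd": 0.5},
    {"nonexp": 0.45, "rct": 0.40, "sd": 2.0},
]

biases = [
    standardized_bias(c["nonexp"], c["rct"], c["sd"]) for c in comparisons
]

# Averaging absolute values keeps positive and negative biases from cancelling.
mean_absolute_bias = mean(abs(b) for b in biases)
print(biases, mean_absolute_bias)
```

Averaging the absolute bias rather than the signed bias is a design choice: signed averages can look close to zero even when individual replications miss the benchmark badly in opposite directions.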

PROTOCOL: The Effectiveness of Volunteer Tutoring Programs: A Systematic Review for the Campbell Collaboration Education Review Group

Article

Full-text available

Dec 2005

Rural land rights reform and agro-environmental sustainability: Empirical evidence from China

Article

May 2018
LAND USE POLICY

The landscape of China's rural land market has been changed by several significant land rights reforms since the 1970s. Gauging the effectiveness of these reforms is of great interest to both the government and the public. We address this question by investigating the impact of a recent land use rights reform, the 'Three Rights Separation Policy', on agro-environmental sustainability. By separating the land management right from the land contracted management right, this reform is believed to be a powerful tool to encourage land transfer, optimize land resource allocation, and increase economies of scale in the agriculture sector. Using a PSM-DID model applied to panel data for the years 2008 and 2014, our study demonstrates that the new policy also increases the use of organic fertilizers by 48.641 kg/mu in total, an important step towards ensuring agro-environmental sustainability in China. The new policy is more effective in encouraging the application of organic fertilizer when the issuing of land certificates is enforced and administrative barriers to land rights transfers are removed. The findings add value to the growing literature on rural land rights reforms in China and may also have significant implications for developing countries with similar rural land tenure systems and underdeveloped land and labor markets.
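The difference-in-differences part of a PSM-DID design with two survey years can be sketched as follows. This is a simplified illustration, not the authors' model: it omits the propensity score matching step entirely, the function name is hypothetical, and the fertilizer figures are made up.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-period difference-in-differences on group means.

    Returns (change in treated group) minus (change in control group).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical organic fertilizer use (kg/mu) for a handful of households,
# observed before and after a reform; values are invented for illustration.
treated_pre = [100.0, 120.0, 110.0]
treated_post = [160.0, 175.0, 165.0]
control_pre = [105.0, 115.0, 110.0]
control_post = [120.0, 125.0, 115.0]

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated effect: {effect:.2f} kg/mu")
```

In a full PSM-DID analysis the control households would first be selected or weighted by estimated propensity scores, so that the two groups being differenced are comparable on observed covariates.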

Methods of Impact Evaluation: A Review

Article

Full-text available

Apr 2017

Effect of Parental Migration on Children's Health in Rural China

Article

Oct 2016

Using 2003–2006 RCRE (Research Center for Rural Economy) panel data, we estimate the effect of parental migration on the health of children left behind, using a combined difference-in-differences and propensity score matching model. On average we do not find any significant effect on children's health; however, the effect varies among different groups. Children's health may improve as a result of parental migration in families with lower income in the base year and in families with higher income growth rates. Furthermore, children's health may deteriorate with maternal migration but improve with longer distance and longer duration of paternal migration. We argue that parental migration affects children's health through complex mechanisms: increased income may have a positive impact while decreased parental care may have a negative effect. The two effects seem to offset each other in rural China.

Lexical Ambiguity in Algebra, Method of Instruction as Determinant of Grade 9 Students’ Academic Performance in East London District

Article

Full-text available

Nov 2014

Olabisi Olaoye

In mathematics education there has been a series of debates on lexical ambiguity in algebra, especially with the resurgence of mathematics educators' awareness of the relevance of language in mathematics education. This study therefore investigated lexical ambiguity in algebra and method of teaching as determinants of grade 9 students' academic performance in East London. A pre-test/post-test quasi-experimental group design was adopted, with a sample of 109 students. The instruments adopted and structured for the study were the Lexical Ambiguity Questionnaire (LAAQ), the Method of Instruction Questionnaire (MIQ), Problem Based Learning Strategies in two parts (PBLSa and PBLSb), and the Conventional Teaching Guide (C.T.G). They were tested at the .05 level of significance using a two-way (2 x 2) Analysis of Covariance (ANCOVA). The findings showed that students exposed to the PBLS achieved higher scores than their counterparts who were exposed to the conventional method. Multiple Comparison Analysis and Tukey post-hoc tests were employed to detect the source of variation and the direction of significance. The findings also revealed that lexical ambiguity determines students' academic performance (r=0.422; P<0.05) and reported an effect of the experiment on students' post-test performance scores in lexical ambiguity (F(2,109) = .926; P<0.05). Method of teaching is also reported to be a determinant of students' performance (r=0.764; P<0.05). Hence, there is a need for teachers to update their knowledge of the problem-solving skills that can be used as a remedy for mathematics phobia and ambiguities in algebra word problems; these skills should also be enshrined in the school curriculum. DOI: 10.5901/mjss.2014.v5n23p897

Using State Tests in Education Experiments: A Discussion of the Issues. Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education

Article

Full-text available

Securing data on students' academic achievement is typically one of the most important and costly aspects of conducting education experiments. As state assessment programs have become practically universal and more uniform in terms of grades and subjects tested, the relative appeal of using state tests as a source of study outcome measures has grown. However, the variation in state assessments, in both content and proficiency standards, complicates decisions about whether a particular state test is suitable for research purposes and poses difficulties when planning to combine results across multiple states or grades. This paper aims to help researchers evaluate and make decisions about whether and how to use state test data in education experiments. It outlines the issues that researchers should consider, including how to evaluate the validity and reliability of state tests relative to study purposes; factors influencing the feasibility of collecting state test data; how to analyze state test scores; and whether to combine results based on different tests. It also highlights best practices to help inform ongoing and future experimental studies. Many of the issues discussed are also relevant for nonexperimental studies.

National Job Corps Study and Longer-Term Follow-Up Study: Impact and Benefit-Cost Findings Using Survey and Summary Earnings Records Data. Princeton, NJ: Mathematica Policy Research

Article

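Kinderschutz in Deutschland stärken: Analyse des nationalen und internationalen Forschungsstandes zu Kindeswohlgefährdung und die Notwendigkeit eines nationalen Forschungsplanes zur Unterstützung der Praxis [Strengthening child protection in Germany: analysis of the national and international state of research on child endangerment and the need for a national research plan to support practice]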

Article

Full-text available

Heinz Kindler

Estimating the Impact of a Russian Job Search Program Targeted on the Unemployed in Very Low-Income Families

Article

Jan 2006
INT LABOUR REV

Using State Tests in Education Experiments: A Discussion of the Issues. NCEE 2009-013

Article

Full-text available

Jan 2009

Securing data on students' academic achievement is typically one of the most important and costly aspects of conducting education experiments. As state assessment programs have become practically universal and more uniform in terms of grades and subjects tested, the relative appeal of using state tests as a source of study outcome measures has grown. However, the variation in state assessments--in both content and proficiency standards--complicates decisions about whether a particular state test is suitable for research purposes and poses difficulties when planning to combine results across multiple states or grades. This discussion paper aims to help researchers evaluate and make decisions about whether and how to use state test data in education experiments. It outlines the issues that researchers should consider, including how to evaluate the validity and reliability of state tests relative to study purposes; factors influencing the feasibility of collecting state test data; how to analyze state test scores; and whether to combine results based on different tests. It also highlights best practices to help inform ongoing and future experimental studies. Many of the issues discussed are also relevant for nonexperimental studies. Appendices include: (1) State Testing Programs Under NCLB; (2) How NCEE-Funded Evaluations Use State Test Data. (Contains 35 footnotes and 4 tables.)

Encouraging the flight of error: Ethical standards, evidence standards, and randomized trials

Article

Mar 2007
New Dir Eval

Robert Boruch

Thomas Jefferson recognized the value of reason and scientific experimentation in the eighteenth century. This chapter extends the idea in contemporary ways to standards that may be used to judge the ethical propriety of randomized trials and the dependability of evidence on effects of social interventions.

Eficacia de un programa de colocación para familias pobres de Rusia [Effectiveness of a job placement program for poor families in Russia]

Article

Jun 2008

When Will We Ever Learn? Recommendations to Improve Social Development through Enhanced Impact Evaluation

Article

Full-text available

Oct 2005

The Endogeneity Problem in Developmental Studies

Article

Full-text available

Mar 2004

Estimates of developmental models of processes involving contextual influences (e.g., child care arrangements, divorce, parenting, neighborhood location, peers) are subject to bias if, as is often the case, the contexts are influenced by the actions of either the individuals being studied or their parents or teachers. We assess the nature of the endogeneity biases that may result, discuss the importance of such biases in practice, and suggest possible ways of avoiding them. Our primary recommendation is that developmentalists consider reorienting their data collection strategies to take advantage of real or "natural" experiments that produce exogenous variation in family and contextual variables of interest. Individuals' lives are shaped by a rich set of interactive genetic, social, structural, and historical forces and processes. Consequently, developmental science places high demands on the evidence needed to separate correlation from causation. Although social science theory can commonly be invoked to limit the scope of problems and isolate key variables, a developmental perspective often does just the opposite. Because a broad theoretical perspective holds great promise for advancing researchers' understanding of human development, developmental scientists should not be simplifying their theories for the sake of empirical tractability. Instead, they should devote themselves to ensuring that their empirical work does justice to the theory.

Impact evaluation for slum upgrading interventions

Article

Jan 2006

Are Business Start-up Subsidies Effective for the Unemployed: Evaluation of Enterprise Allowance

Article

Jan 2006

Geoff Perry

Comparing Experimental and Matching Methods Using a Large-Scale Field Experiment on Voter Mobilization

Article

Full-text available

Dec 2006

In the social sciences, randomized experimentation is the optimal research design for establishing causation. However, for a number of practical reasons, researchers are sometimes unable to conduct experiments and must rely on observational data. In an effort to develop estimators that can approximate experimental results using observational data, scholars have given increasing attention to matching. In this article, we test the performance of matching by gauging the success with which matching approximates experimental results. The voter mobilization experiment presented here comprises a large number of observations (60,000 randomly assigned to the treatment group and nearly two million assigned to the control group) and a rich set of covariates. This study is analyzed in two ways. The first method, instrumental variables estimation, takes advantage of random assignment in order to produce consistent estimates. The second method, matching estimation, ignores random assignment and analyzes the data as though they were nonexperimental. Matching is found to produce biased results in this application because even a rich set of covariates is insufficient to control for preexisting differences between the treatment and control groups. Matching, in fact, produces estimates that are no more accurate than those generated by ordinary least squares regression. The experimental findings show that brief paid get-out-the-vote phone calls do not increase turnout, while matching and regression show a large and significant effect.

Can Personal Financial Management Education Promote Asset Accumulation by the Poor?

Article

Full-text available

Mar 2006

John Caskey

This paper asks whether personal financial management education is an effective mechanism for helping lower-income households accumulate financial assets and improve credit histories. The paper argues that the best existing studies of the effectiveness of financial literacy initiatives suggest that such initiatives might help lower-income households build savings and improve credit records, but the results are only suggestive due to the limitations of the studies. The paper concludes that a high research priority should be gathering more robust evidence on whether teaching personal financial management skills to lower-income households can be an effective means of improving their financial situations.

Recommended publications

Using video modeling to teach academic skills to students with disabilities: a review of the literat...