Table 3 - uploaded by Mei-kuang Chen
Model descriptions by Research Question Model

Source publication
Article
In this study we explored the effects of statistical controls, single versus multiple cohort models, and student sample size on the stability of teacher value-added estimates (VAEs). We estimated VAEs for all 5th grade mathematics teachers in a large urban district by fitting two-level mixed models using four cohorts of student data. We found that...

Contexts in source publication

Context 1
... we describe for each research question, which specific models were fit. Table 3 summarizes the different models by research question. ...
Context 2
... are the effects of statistical controls on the stability of teacher VAEs? To answer this research question, we fit four 2-level, single-cohort models to the 2007 cohort data (denoted Model 1 through 4 in Table 3). We varied the statistical controls included (previous year's test scores only versus test scores from two prior years, and student background characteristics), and we also compared two different strategies for controlling: using test scores only, or using a combination of test scores and other student characteristics known to affect learning. ...
Context 3
... used these single cohort VAEs to evaluate stability of VAEs over time for teachers (N = 648) with four student cohorts. We also fit a multiple cohort model using all four cohorts (Model 5 in Table 3), in which we again controlled only for previous year test scores. We compared VAEs from each single cohort with the stable VAE obtained from the multiple cohort model. ...
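
For orientation, a two-level, single-cohort model of the kind these excerpts describe is often written as a random-intercept model. What follows is a generic sketch under assumed notation, not the specification from Table 3; the symbols $y_{ij}$, $x_{ij}$, $\beta_0$, $\beta_1$, $u_{0j}$, $e_{ij}$, $\tau^2$, and $\sigma^2$ are illustrative and do not come from the source publication:

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_{0j} + e_{ij}, \\
u_{0j} &\sim N(0, \tau^2), \qquad e_{ij} \sim N(0, \sigma^2)
\end{align}
```

Here $y_{ij}$ would be the current-year score of student $i$ taught by teacher $j$, $x_{ij}$ the previous year's test score, and the predicted $u_{0j}$ would play the role of the teacher's VAE. Controlling for a second prior-year score or for student background characteristics would add further $\beta$ terms to the first line.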

Similar publications

Article
Abstract: The Kähler potentials of modular symmetry models receive unsuppressed contributions which may be controlled by a flavor symmetry, where the combination of the two symmetry types is referred to as eclectic flavor symmetry. After briefly reviewing the consistency conditions of eclectic flavor symmetry models, including those with generalise...

Citations

... One indicator of this would be to examine whether the MGPs themselves aligned with other indicators of teacher quality derived, as it was in this case, via the observations of the same teachers' practice at the same time using the TAP. Again, doing this is very common (Bill & Melinda Gates Foundation, 2013; Curtis, 2011; Goldring et al., 2015; Grossman et al., 2014; Harris, 2012; Hill, 2009; Hill et al., 2011; Kane & Staiger, 2012; Kimball et al., 2004; Kersting et al., 2013; Martínez et al., 2016; Measures of Effective Teaching (MET), 2013; Polikoff & Porter, 2014; Rothstein & Mathis, 2013; Sass & Harris, 2012; Strunk et al., 2014). ...
Article
In this study, we estimated the relationship between two popular measures of teacher effectiveness—teachers’ value-added model (VAM) estimates, represented in this study via median growth percentiles (MGPs), and teachers’ observational scores, derived from the TAP System for Teacher and Student Advancement. We examined the relationship between these measures separately for teachers at different values of the MGPs distribution, as opposed to examining a conventional aggregated correlation. Previous research has shown that these aggregated (co)relationships are typically very weak, but in this study, we demonstrate that such correlations vary by the location of the score in the overall score distribution, and that the relationship between teachers’ MGPs and observational scores is non-linear. This suggests that when these two indicators of teacher quality are used together, especially for high-stakes decision-making purposes (e.g., tenure, merit pay, termination), teachers’ positions in the score distribution should be considered in the design of any teacher evaluation system or formulation.
... However, by using student test scores for the same teacher for multiple years, as in this current study, it can be ensured that the effects (if any) can be attributed to the teacher. In the VAM concept, teacher effect has a special meaning that relates to evaluating the discrepancies between expected and observed student test scores [3,6]. ...
... Hu [12] recorded that on average, 57% and 59% of the changes in student math and reading test scores can be explained by the closest lagged score in the related subjects, respectively. Similarly, Kersting et al. [6] found that just one test score from a previous year explained 68% of the variation in the students' actual scores. In accordance with the scope of this study, Rothstein [13] analyzed the impact of variables on the value-added estimates through modifying the models' R-squared by employing 28 contextual variables, such as ethnicity, sex, free/reduced lunch status, parental education, etc. ...
... Although Alban [14] and Gagnon [15] report that adding student-level variables in the estimates improves predictability, most studies suggest that the influence of variables at the student level employed in the equations is negligible [6,12,19-23]. Gagnon [15] analyzed the use of different predictors at the student level in the value-added teacher effectiveness estimates, including lagged scores, poverty, ethnicity, sex, English as a second language, disability status, attendance and suspension. ...
Article
It is widely believed that the teacher is one of the most important factors influencing a student's success at school. In many countries, teachers' salaries and promotion prospects are determined by their students' performance. Value-added models (VAMs) are increasingly used to measure teacher effectiveness to reward or penalize teachers. The aim of this paper is to examine the relationship between teacher effectiveness and student academic performance, controlling for other contextual factors, such as student and school characteristics. The data are based on 7543 Grade 8 students matched with 230 teachers from one province in Turkey. To test how much progress in student academic achievement can be attributed to a teacher, a series of regression analyses were run including contextual predictors at the student, school and teacher/classroom level. The results show that approximately half of the differences in students' math test scores can be explained by their prior attainment alone (47%). Other factors, such as teacher and school characteristics, explain very little of the variance in students' test scores once prior attainment is taken into account. This suggests that teachers add little to students' later performance. The implication, therefore, is that any intervention to improve students' achievement should be introduced much earlier in their school life. However, this does not mean that teachers are not important. Teachers are key to schools and student learning, even if they are not differentially effective from each other in the local (or any) school system. Therefore, systems that attempt to differentiate "effective" from "ineffective" teachers may not be fair to some teachers.
... where teacher effectiveness is operationally defined by VAM as the estimated difference between expected and observed student test scores (Kersting, Chen and Stigler, 2013). Moreover, in this systematic review study, the operational definition of the term stability refers to the stability of the estimates with respect to (a) the number of test scores used, (b) the predictors used in the estimations, and (c) the analysis methods applied. ...
Conference Paper
This article provides evidence by undertaking a systematic review of the stability problem of using value-added models in teacher effectiveness estimates, from the perspective of the impact of the number of previous test scores employed, aimed at answering a unique review question: How stable are teacher effectiveness estimates measured by VAMs? Using the terms teacher performance, student performance, value-added model, stability, and their related synonyms, a comprehensive search was conducted in 17 databases, along with a hand search in Google Scholar and contacting authorised persons by email. In total, 1439 records were found as a result of the searches. After completing the screening process, 50 studies remained for data extraction. Out of the total of 50 studies in the review list, 13 focused on the stability of VAM estimates with regard to the number of prior test scores used. In summary, there is a common view that the use of prior-year data in value-added estimates of teacher effectiveness has a positive impact; however, with regard to the impact of multiple previous years of data, researchers' views differ.
... Research on VAM has consistently found issues that should give pause to policymakers implementing VAMs for evaluative purposes. Studies have shown there are myriad concerns with the validity of VAM due to issues with model specification (e.g., Amrein-Beardsley, 2008; Goldhaber et al., 2013; Hill et al., 2011; Kersting et al., 2013; Schochet & Chiang, 2011) as well as the validity threats posed by other matters like test selection and timing (Papay, 2011). Bearing this body of research in mind, it is critical for state policies regarding VAM to be closely scrutinized. ...
... It also showed that the correlations for teachers from year-to-year significantly varied by subject matter, posing additional concerns about the fairness of VAM for all teachers. Furthermore, since we do not have sufficient information about the model, we are left with the myriad concerns related to model specification and fit that are prevalent in the research on VAM (e.g., Amrein-Beardsley, 2008; Goldhaber et al., 2013; Hill et al., 2011; Kersting et al., 2013; Schochet & Chiang, 2011). ...
... While research on the stability of VA scores indicates low to moderate stability measures of VA scores (correlations between years ranging from .2 to .66; see, for example, Kersting et al. 2013), the authors also argue that the assumption that teachers do not change over time is unreasonable, indicating that even a benchmark of .8 would already be too high. For a more exhaustive discussion of limitations and their implications, see, for example, Everson (2017) or Perry (2016). ...
Article
Value-added (VA) modeling can be used to quantify teacher and school effectiveness by estimating the effect of pedagogical actions on students’ achievement. It is gaining increasing importance in educational evaluation, teacher accountability, and high-stakes decisions. We analyzed 370 empirical studies on VA modeling, focusing on modeling and methodological issues to identify key factors for improvement. The studies stemmed from 26 countries (68% from the USA). Most studies applied linear regression or multilevel models. Most studies (i.e., 85%) included prior achievement as a covariate, but only 2% included noncognitive predictors of achievement (e.g., personality or affective student variables). Fifty-five percent of the studies did not apply statistical adjustments (e.g., shrinkage) to increase precision in effectiveness estimates, and 88% included no model diagnostics. We conclude that research on VA modeling can be significantly enhanced regarding the inclusion of covariates, model adjustment and diagnostics, and the clarity and transparency of reporting.
... One indicator of this would be to examine whether the MGPs themselves align with other indicators of teacher quality derived, as also in this case, via the observations of the same teachers' practice at the same time using the TAP. Doing this is common, as well, across many other studies (see, for example, Goldring et al., 2015; Grossman et al., 2014; Hill et al., 2011; Kane & Staiger, 2012; Kersting et al., 2013; Measures of Effective Teaching (MET) Project, 2013; Polikoff & Porter, 2014; Rothstein & Mathis, 2013; Sass & Harris, 2012; Strunk et al., 2014); although, there are some researchers who disagree that these indicators should map onto the same construct given, for example, teaching is such a dynamic construct (see, for example, Braun, Goldschmidt, McCaffrey, & Lissitz, 2012; Good, 2014; Harris, 2012; Kennedy, 2010; Martinez, Schweig, & Goldschmidt, 2016). ...
... More specifically, the correlations being observed among mathematics and English/language arts (ELA) teachers' VAM-based estimates and either their observational scores or student surveys of teacher quality are low to moderate in magnitude (e.g., 0.2 ≤ r ≤ 0.5; see, for example, American Statistical Association (ASA) 2014; Bill and Melinda Gates Foundation 2013; Curtis 2011; Graue et al. 2013; Harris 2011; Hill et al. 2011; Jacob and Lefgren 2005; Kimball et al. 2004; Kersting et al. 2013; Kyriakides 2005; Milanowski et al. 2004; Loeb et al. 2015; Nye et al. 2004; Polikoff and Porter 2014; Rothstein and Mathis 2013). While some argue that the “more subjective” measures (e.g., supervisors' observational scores, student or parent surveys) are at fault, others argue that all of the measures, including VAM estimates, are at fault for the low correlations observed, because all of the measures typically used to examine teacher effects are limited and insufficient. ...
Article
In this study, researchers compared the concordance of teacher-level effectiveness ratings derived via six common generalized value-added model (VAM) approaches including a (1) student growth percentile (SGP) model, (2) value-added linear regression model (VALRM), (3) value-added hierarchical linear model (VAHLM), (4) simple difference (gain) score model, (5) rubric-based performance level (growth) model, and (6) simple criterion (percent passing) model. The study sample included fourth to sixth grade teachers employed in a large, suburban school district who taught the same sets of students, at the same time, and for whom a consistent set of achievement measures and background variables were available. Findings indicate that ratings significantly and substantively differed depending upon the methodological approach used. Findings, accordingly, bring into question the validity of the inferences based on such estimates, especially when high-stakes decisions are made about teachers as based on estimates measured via different, albeit popular methods across different school districts and states.
... Intertemporal stability refers to the situations when teachers who are classified as effective one year might be classified as ineffective the next, or vice versa (see, for example, Au, 2011; Koedel & Betts, 2007; Schochet & Chiang, 2010, 2013). Concerns about validity arise when different measures of teacher effectiveness that theoretically map onto the same teacher effectiveness construct (e.g., observational scores or student/parent survey results) do not demonstrate a substantive statistical relationship with MGPs (see, for example, Goldring et al., 2015; Grossman, Cohen, Ronfeldt, & Brown, 2014; Hill, Kapitula, & Umland, 2011; Kane & Staiger, 2012; Kersting, Chen, & Stigler, 2013; Measures of Effective Teaching (MET) Project, 2013; Polikoff & Porter, 2014; Sass & Harris, 2012; Strunk, Weinstein, & Makkonen, 2014). ...
... One indicator of this would be to examine whether the MGPs themselves align with other indicators of teacher quality derived, for instance, via the observations of the same teachers' practice at the same time. Doing this is common, as well, across many of the studies researchers are continuously conducting on SGMs, in general (see, for example, Goldring et al., 2015; Grossman et al., 2014; Hill et al., 2011; Kane & Staiger, 2012; Kersting et al., 2013; Measures of Effective Teaching (MET) Project, 2013; Polikoff & Porter, 2014; Rothstein & Mathis, 2013; Sass & Harris, 2012; Strunk et al., 2014); although, there are some researchers who disagree that these indicators should map onto the same construct given, for example, teaching is such a dynamic construct (see, for example, Braun, Goldschmidt, McCaffrey, & Lissitz, 2012; Good, 2014; Harris, 2012; Kennedy, 2010; Martinez et al., 2016). ...
... Given that no benchmark exists to compare these estimates to other estimates from the literature in terms of MGPs, however, the best we could do is to benchmark these correlations against other VAM-based estimates, or estimates of performance stability measures in other skilled jobs. To date, the estimates of teacher-level VAM correlations typically fall within the range of 0.2 to 0.5, and are generally lower than 0.5 (Di Carlo, 2013; Goldhaber & Hansen, 2010; Kane & Staiger, 2012; Kersting et al., 2013; Koedel & Betts, 2007; McCaffrey, Sass, Lockwood, & Mihaly, 2009; Newton, Darling-Hammond, Haertel, & Thomas, 2010; Sass, 2008). To put the findings in this study into perspective, we found that intertemporal correlations for teacher effectiveness derived from the SGP model fall into the upper bound of this range. ...
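
The intertemporal correlations quoted in these excerpts are ordinary Pearson correlations between the same teachers' estimates in two different years. A minimal sketch of that calculation follows; the function name `pearson` and the two lists of teacher estimates are hypothetical illustration values, not data from any of the cited studies:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)


# Hypothetical estimates for the same five teachers in two consecutive years.
year1 = [0.3, -0.1, 0.5, -0.4, 0.0]
year2 = [0.1, 0.2, 0.3, -0.2, -0.1]

r = pearson(year1, year2)
print(f"year-to-year correlation: {r:.2f}")
```

With real data, each list position would have to refer to the same teacher in both years, and teachers missing from either year would need to be dropped before the correlation is computed.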
Article
While states are no longer required to set up teacher evaluation systems based in significant part on student test scores, quite a few continue to use value-added models (VAMs) or student growth percentile (SGP) models for that purpose. In this study, we analyzed three years of teacher data to illustrate the performance of teachers’ median growth percentiles (MGPs). We found MGPs’ consistency over time to be comparable with existing estimates from VAMs. Additionally, we found that MGPs do not substantively agree with another measure of teacher quality: teachers’ observational scores. These findings suggest that caution should be exercised when teachers’ MGPs, as well as VAMs, are used in teacher evaluation systems to make high-stakes decisions such as merit pay, tenure, or teacher contract termination. Our findings about the correlation of MGPs with observational scores support the idea of the multidimensional nature of the teacher effectiveness construct.
... The search for a reliable and comparable measure of the contribution made by a higher education institution or academic program faces multiple challenges. First, there are the challenges related to the statistical reliability of the measure (Thomas et al., 1996; Kersting et al., 2013). Second, there are the conceptual difficulties related to the interpretation of the indicator, mentioned in the previous section (McCaffrey et al., 2004; McCaffrey et al., 2009; Koedel & Betts, 2011). ...
... Third, there are the analyses of the programs and policies that can be implemented based on the results of the contribution measurements. Fourth, there is the capacity of the measure to convey information about institutional changes over time in a stable manner (Kersting et al., 2013; Steedle, 2010; Goldhaber et al., 2013; McCaffrey et al., 2004; McCaffrey et al., 2009; Ballou et al., 2012). ...
... Measurement errors at the school level suggest that a fraction of the variance is not explained by the included factors, and hence several adjustments are required. The adjustments suggested in the literature range from increasing the number of observations (a larger number of years in the estimations, for example) and adjusting the coefficients with Bayesian procedures (shrinkage) to including confidence intervals so that statistical tests can be performed (Kane & Staiger, 2002, 2008; Kersting et al., 2013). ...
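
Shrinkage of the kind mentioned in this excerpt is commonly illustrated by multiplying a raw unit-level mean by a reliability weight that grows with the number of observations. The sketch below shows one textbook form of that idea, not the procedure used in any cited study; the function name `shrink_means`, the variance values `tau2` and `sigma2`, and the residuals are all hypothetical:

```python
from statistics import mean


def shrink_means(residuals_by_unit, tau2, sigma2):
    """Shrink each unit's mean residual toward zero.

    residuals_by_unit: dict mapping a unit id (teacher or school) to a list
        of student-level residuals.
    tau2: assumed between-unit variance.
    sigma2: assumed within-unit (student-level) variance.
    """
    shrunk = {}
    for unit, residuals in residuals_by_unit.items():
        n = len(residuals)
        # Reliability weight: approaches 1 as n grows, so large units
        # are shrunk less than small ones.
        reliability = tau2 / (tau2 + sigma2 / n)
        shrunk[unit] = reliability * mean(residuals)
    return shrunk


# Two hypothetical units with the same raw mean (5.0) but different sizes.
example = {
    "small_unit": [4.0, 6.0],
    "large_unit": [4.0, 6.0, 5.0, 5.0, 4.0, 6.0, 5.0, 5.0],
}
estimates = shrink_means(example, tau2=1.0, sigma2=4.0)
print(estimates)
```

The unit with more observations keeps more of its raw mean, which is one way to see why adding years of data and applying shrinkage are both listed as remedies for noisy estimates.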
... The error at the teacher level ($u_{0j}$) also is assumed to be normally distributed with mean 0 and variance $\tau_{\alpha}^{2}$. Although this model has been used to obtain teacher estimates by previous research (Kersting et al. 2013; Newton et al. 2010), it is important to highlight that the teacher effect estimates may need to be corrected due to a possible endogeneity problem (see Manzi et al. 2013). ...
Article
This research examines empirically the relationship between two measures of teacher quality: one based on professional standards and a second one using teacher value-added estimates. It also studies the extent to which teacher observable characteristics, such as teacher training variables, are associated with better performance on either of these measures and whether either of these two assessments is able to effectively measure teacher quality isolated from the effect of the context where teachers work. Context in this article is defined as any variable that is not under the direct control of the teacher but plays an important role in student learning and that we believe is captured by school and municipal variables. The study uses hierarchical linear models and information from national and standardized assessments from Chile, specifically from the municipal education sector. Results show a small correlation between the two measures of teacher quality, in the lower end of results from previous studies conducted in the USA, and suggest that there is only a limited relationship between both measures of teacher quality. Teacher initial education type and professional development were statistically associated only with the standard-based measure of teacher quality. Context (at both the school and municipal levels) plays an important role in the teacher effect measure and in the standard-based measures; therefore, we conclude that neither of these measures is context-free. We expect that these results will contribute to the discussion about how to best measure teacher quality and how to evaluate teacher performance both in Chile and other parts of the world.