Individual Submission Summary

The causal effect of re-evaluating lower-performing teachers in Chile

Thursday, November 13, 8:30 to 10:00am, Property: Hyatt Regency Seattle, Floor: 5th Floor, Room: 506 - Samish

Abstract

Performance evaluations for teachers are a ubiquitous practice implemented across countries as a means to assess employee skill, make personnel decisions, provide feedback, and target incentives (Taylor, 2023). Evidence, however, is mixed on the efficacy of such systems for improving student outcomes (Bleiberg et al., 2024; Dee & Wyckoff, 2015; Lombardi, 2019; Taylor & Tyler, 2012). The heterogeneity in documented evaluation effects raises questions about how the specific features of evaluation systems contribute to their success or failure at impacting student learning, but often it is difficult to study specific design features as opposed to the sum effects of the evaluation system as a whole.

In this study, we leverage a policy change to one component of Chile’s national teacher evaluation system (the frequency of evaluation for lower-performing teachers) to estimate its causal effect on student outcomes. Starting in 2005, most public school teachers in Chile were required by law to be evaluated every four years through the country’s comprehensive evaluation system, which included videotaped lessons, work portfolios, interviews, and administrator ratings. In 2011, this law was changed such that teachers who did not achieve a rating of at least competent (level 3 out of 4) had to be evaluated again two years later. We ask how requiring teachers to undergo more frequent evaluation affects student learning, teacher effort, and teacher beliefs in the years following assignment to re-evaluation.

To estimate the policy’s effects, we use administrative data from the years 2005-2015 for the universe of Chilean primary school students and teachers that includes teachers’ evaluation scores, standardized test scores for fourth-grade students, and surveys about teacher beliefs and behaviors from students, parents, and teachers themselves. Because the policy requiring re-evaluation is imperfectly observed, we employ a fuzzy difference-in-differences design. This design compares the difference in outcomes between students assigned to teachers rated below and above the competent threshold before the policy with the same difference after the re-evaluation policy was implemented.
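As a rough illustration of the logic behind a fuzzy difference-in-differences comparison, the sketch below computes a simple Wald-type ratio: the difference-in-differences in an outcome divided by the difference-in-differences in the share of teachers actually re-evaluated. It is not the authors' estimator. The function names, the group and period labels, and the toy numbers are all hypothetical, and the sketch omits covariates, fixed effects, and standard errors.

```python
from statistics import mean

# Hypothetical labels: "below"/"above" the competent threshold,
# "pre"/"post" the 2011 policy change.

def did(cells):
    """Difference-in-differences of cell means.

    `cells` maps (group, period) -> list of values.
    Returns (below - above) after the policy minus (below - above) before it.
    """
    post_gap = mean(cells[("below", "post")]) - mean(cells[("above", "post")])
    pre_gap = mean(cells[("below", "pre")]) - mean(cells[("above", "pre")])
    return post_gap - pre_gap

def wald_did(outcome_cells, treatment_cells):
    """Scale the outcome DiD by the DiD in treatment take-up.

    The denominator (the "first stage") is the change in the share of
    teachers actually re-evaluated; it must be non-zero.
    """
    first_stage = did(treatment_cells)
    if first_stage == 0:
        raise ValueError("no change in treatment take-up; ratio undefined")
    return did(outcome_cells) / first_stage

# Toy, made-up data: student test scores in standard deviation units.
toy_outcomes = {
    ("below", "pre"): [-0.4, -0.2],
    ("above", "pre"): [0.1, 0.3],
    ("below", "post"): [-0.5, -0.3],
    ("above", "post"): [0.1, 0.3],
}
# Toy indicator: 1 if the student's teacher was re-evaluated, else 0.
toy_treatment = {
    ("below", "pre"): [0, 0],
    ("above", "pre"): [0, 0],
    ("below", "post"): [1, 0],
    ("above", "post"): [0, 0],
}

reduced_form = did(toy_outcomes)
take_up = did(toy_treatment)
effect = wald_did(toy_outcomes, toy_treatment)
```

In this toy setup only half of the below-threshold teachers are re-evaluated after the policy, so the outcome difference-in-differences is scaled up by the take-up rate to approximate the effect per re-evaluated teacher.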

We find that the re-evaluation policy did not increase student learning in language or math in either the year of re-evaluation or the year afterwards. In contrast, we find decreases in teaching practices as rated by students in both years (0.22 and 0.30 standard deviations). Moreover, in the year teachers are informed they will need to be re-evaluated, parents rate teachers as less caring (by 0.24 standard deviations). Our estimates are robust to considerations of student sorting, differential teacher attrition, and alternative model specifications.

Together, the results suggest that requiring teachers who do not meet standards to undergo re-evaluation may not result in improved student outcomes. In part, this appears to be because the re-evaluation does not induce improvements in teaching practices or beliefs. By leveraging variation in a specific design feature of a performance evaluation system operating at a national scale, we offer novel causal evidence that can inform efforts to design or improve evaluation systems for teachers and other employees.

Authors