Paper Summary

The Reliability and Validity of Teacher Scores Based on Student Learning Objectives

Fri, April 8, 12:00 to 1:30pm, Convention Center, Floor: Level Two, Room 209 B

Abstract

Purpose
Student learning objectives (SLOs) are currently used in teacher evaluation systems in 30 states throughout the U.S. (Lacireno-Paquet et al., 2014). In each state, upwards of 70% of teachers will have an SLO score, which counts toward as much as 50% of their evaluation. Yet very little evidence exists to support the inferences drawn from teacher SLO scores for high-stakes decisions (Harris, 2012; Slotnick et al., 2013). This study addresses this gap by examining the validity and reliability of teacher SLO scores from one Race to the Top (RTTT) state implementing SLOs as part of its teacher evaluation system.

Theoretical Framework
A comprehensive validity argument is beyond the scope of this paper. However, if teachers’ SLO scores reflect their underlying ability to help students learn, and if teacher effectiveness is relatively stable over time and across subject areas (Loeb and Candelaria, 2013; Goldhaber and Hanson, 2012), then, as with value-added scores, SLO scores should be fairly stable over time and across courses, converge with other metrics that purport to measure a teacher’s effectiveness based on student growth, and diverge from measures of classroom demographics (Hill et al., 2011; Bill and Melinda Gates Foundation, 2013).
Therefore, I ask:
1. What is the within-teacher across-year stability of teacher SLO scores?
2. What is the within-teacher across-course reliability of teacher SLO scores?
3. What is the relationship between teacher SLO scores and Mean Student Growth Percentile (MGP) scores for teachers who have both metrics?
4. What is the relationship between teacher SLO scores and classroom make-up?

Method
To address each research question, I examined:
a) the within-teacher across-year SLO correlation for teachers who have SLO scores from the same course in 2012-13 and 2013-14;
b) the within-teacher across-course SLO correlation for teachers who teach multiple courses in a given year;
c) the correlation between teacher SLO scores and MGP scores for teachers who have both; and
d) the correlation between a teacher’s SLO score and
i. the average classroom standardized prescore, and
ii. the percentages of students within each classroom designated as English language learners (ELL) and as students with disabilities (SWD).
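
As a rough illustration of analysis (a), the sketch below pairs each teacher-course combination that has a score in both years and computes a Pearson correlation across those pairs. The abstract does not state which correlation coefficient or data layout was used, so the use of Pearson's r, the dictionary layout, the function names (`pearson`, `across_year_correlation`), and the toy records are all assumptions made for illustration, not the author's procedure or data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)

def across_year_correlation(scores, year_a, year_b):
    """Correlate SLO scores for teacher-course pairs observed in both years.

    scores: dict mapping (teacher_id, course, year) -> SLO score
            (hypothetical layout, assumed for this sketch).
    Returns (correlation, number of matched teacher-course pairs).
    """
    pairs = [
        (score, scores[(teacher, course, year_b)])
        for (teacher, course, year), score in scores.items()
        if year == year_a and (teacher, course, year_b) in scores
    ]
    if not pairs:
        raise ValueError("no teacher-course pairs observed in both years")
    xs, ys = zip(*pairs)
    return pearson(xs, ys), len(pairs)

# Made-up records for illustration only; these are not data from the study.
toy = {
    ("t1", "algebra", "2012-13"): 2.0, ("t1", "algebra", "2013-14"): 3.0,
    ("t2", "algebra", "2012-13"): 3.0, ("t2", "algebra", "2013-14"): 2.0,
    ("t3", "biology", "2012-13"): 4.0, ("t3", "biology", "2013-14"): 4.0,
    ("t4", "biology", "2012-13"): 1.0,  # no 2013-14 score, so excluded
}
r, n = across_year_correlation(toy, "2012-13", "2013-14")
```

The same `pearson` helper could be reused for analyses (b) through (d) by changing which two columns are paired, for example a teacher's SLO score against the classroom's %SWD.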

Data sources
My data consist of 2012-13 and 2013-14 statewide teacher SLO scores, classroom demographics (%ELL and %SWD), and MGP scores.

Results
I find that, while teacher SLO scores are moderately stable across courses, they are not stable over time, likely because of changes made to the assessments and targets used to determine student SLO scores. Further, for teachers with both SLO and MGP scores, the two metrics are not related. Finally, teachers in courses with higher average student prescores and lower proportions of students with disabilities have slightly higher SLO scores. These results are similar to those found for value-added-based metrics of teacher performance.

Scholarly Significance
The findings presented in this paper suggest that the assumptions underpinning the use of SLO scores for teacher evaluation need to be investigated well in advance of scores being used for high-stakes decisions. I suggest several mechanisms to consider in order to increase the validity of SLO scores.

Author