Paper Summary

Raters’ Perspectives: Elevating the Voices of Raters in Classroom Observation Research

Thu, April 24, 9:50 to 11:20am MDT, The Colorado Convention Center, Floor: Meeting Room Level, Room 709

Abstract

Purpose
The purpose of this study was to examine raters’ role in the development of the AR. While researchers have attended to validity when developing classroom observation instruments (e.g., Author, 2014; Gleason et al., 2017), we have found no reports that intentionally elevate raters’ voices in the instrument development process. This study addresses this gap, guided by the following research question: In the context of cognitive interviews conducted at multiple time points across the development of the AR, what are the lessons learned from raters?

Perspectives
The Standards for Educational and Psychological Testing (AERA, APA, & NCME, 2014) outline different sources of validity evidence that build a case for how and why valid inferences can be drawn from an instrument’s resulting scores. One such source is response process evidence; an example is conducting cognitive interviews with raters to evaluate whether scoring is applied accurately and consistently (Willis, 2004) and to illuminate factors that may affect the scoring process.

Data and Methods
Participants in this study were four raters, one undergraduate student and three graduate students, who applied AR scores to video-recorded mathematics lessons in grades 3–8. Cognitive interviews with the raters were conducted over one year at four strategic time points based on the timeline for development of the AR. Prior to participating in an interview, raters watched and scored a common lesson using the AR. During the interview, raters shared their scores and score justifications, describing specific lesson evidence. They also shared their general scoring processes and what they were unsure about relative to each rubric. After each time point, interviews were analyzed to inform next steps in the AR development process.

Findings
Two primary findings resulted from the iterative implementation of cognitive interviews. First, raters raised questions related to lesson evidence for three different components of the classroom observation system: (1) a rubric itself (i.e., “Does this evidence land on or align with this rubric?”); (2) rubric levels (i.e., “Does this evidence warrant moving up a level on the rubric?”); and (3) terms on a rubric (e.g., “Does this evidence count as a rationale?”). Second, raters provided insight into AR support materials and processes. They shared examples from the scoring manual that were helpful and examples that needed improvement. Additionally, their descriptions of how they approached notetaking while watching a lesson signaled the need to formalize a notetaking process across raters.

Significance
Raters in this study played a significant role in informing the development of the AR observation system. Because raters talked aloud about their interpretations of rubrics, terms, and levels, their voices were critical in shaping targeted follow-up training, the structure of future drift meetings, and revisions to the initial training materials. Furthermore, the cognitive interviews informed improvements to scoring processes. Without the iterative cognitive interviews, the AR observation system would be less robust and rigorous, underscoring the significance of raters’ voices in the instrument development process.
