Paper Summary

A Comparison of the Generalizability of Scores Produced by Human Scorers and an Automated Scoring System

Thu, April 16, 12:00 to 1:30pm, Sheraton, Floor: Fourth Level, Chicago VI&VII


The use of constructed-response items and the implementation of automated scoring in large-scale assessments have increased in recent years. This study compared scores produced by human raters and an automated scoring system, KASS, using generalizability analyses. The results indicate that the automated scoring system yields scores nearly as reliable as those produced by human scoring. Closer inspection of the results reveals that the reliability of the automatically generated scores varies across item clusters. A more comprehensive set of results, along with an in-depth discussion, will appear in the full paper.
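The generalizability analyses the abstract refers to decompose score variance into components (e.g., persons, raters, and their interaction) and combine them into a generalizability coefficient. As an illustration only, and not the authors' actual procedure, the sketch below estimates variance components for a fully crossed persons-by-raters design from the classical ANOVA mean squares; the function name `g_study` and the toy data are hypothetical.

```python
import numpy as np

def g_study(scores):
    """Estimate variance components for a crossed persons x raters design.

    scores: 2-D array, rows = persons, columns = raters.
    Returns (var_person, var_rater, var_residual, g_coefficient).
    """
    n_p, n_r = scores.shape
    grand = scores.mean()
    p_means = scores.mean(axis=1)   # per-person means
    r_means = scores.mean(axis=0)   # per-rater means

    # Sums of squares for the two main effects and the residual
    ss_p = n_r * ((p_means - grand) ** 2).sum()
    ss_r = n_p * ((r_means - grand) ** 2).sum()
    ss_tot = ((scores - grand) ** 2).sum()
    ss_res = ss_tot - ss_p - ss_r

    # Mean squares
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square solutions; negative estimates are truncated at 0
    var_res = ms_res
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)

    # Generalizability coefficient for relative decisions with n_r raters
    denom = var_p + var_res / n_r
    g_coef = var_p / denom if denom > 0 else 0.0
    return var_p, var_r, var_res, g_coef

# Toy example: three examinees, two raters in perfect agreement
scores = np.array([[1.0, 1.0],
                   [3.0, 3.0],
                   [5.0, 5.0]])
var_p, var_r, var_res, g = g_study(scores)
```

With the raters in perfect agreement, all variance is attributable to persons and the generalizability coefficient equals 1; disagreement between the human and automated scores would inflate the residual component and lower the coefficient, which is the kind of comparison the study reports.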