Paper Summary

Monitoring the Performance of Human and Automated Scores for Spoken Responses

Fri, April 17, 4:05 to 5:35pm, Marriott, Floor: Fourth Level, Belmont

Abstract

Scoring models for the SpeechRater system were built and evaluated for a Speaking section in an English language assessment. Monitoring charts such as Shewhart control charts and evaluation statistics such as percent agreement, weighted kappa, Pearson correlations, standardized differences in mean scores, and bias were used to evaluate the SpeechRater model performance against human scores. Performance was also evaluated across different native language groups.
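As a rough illustration of the evaluation statistics named in the abstract, the sketch below computes exact percent agreement, quadratic weighted kappa, the Pearson correlation, a standardized mean difference, and simple Shewhart control limits for paired human and automated scores. It is not the authors' implementation. All function names are hypothetical, and the quadratic kappa weights, the pooled-standard-deviation form of the standardized difference, and the three-sigma limits estimated from a baseline period are assumptions, since the abstract does not state which variants were used.

```python
from collections import Counter
from math import sqrt
from statistics import mean, stdev, variance


def percent_agreement(human, machine):
    """Share of responses where the two scores match exactly."""
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def quadratic_weighted_kappa(human, machine, categories):
    """Weighted kappa with quadratic disagreement weights (assumed variant).

    Undefined (division by zero) if all scores fall in one category.
    """
    n = len(human)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    observed = Counter((index[h], index[m]) for h, m in zip(human, machine))
    h_marg = Counter(index[h] for h in human)
    m_marg = Counter(index[m] for m in machine)
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2
            num += w * observed[(i, j)] / n          # observed disagreement
            den += w * h_marg[i] * m_marg[j] / n**2  # chance disagreement
    return 1.0 - num / den


def pearson(x, y):
    """Pearson product-moment correlation."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def standardized_mean_difference(human, machine):
    """(machine mean - human mean) / pooled SD (assumed definition)."""
    pooled_sd = sqrt((variance(human) + variance(machine)) / 2)
    return (mean(machine) - mean(human)) / pooled_sd


def shewhart_limits(baseline, n_sigma=3):
    """Lower limit, center line, upper limit from a baseline period."""
    center = mean(baseline)
    spread = n_sigma * stdev(baseline)
    return center - spread, center, center + spread


def out_of_control(values, limits):
    """Indices of monitored values falling outside the control limits."""
    lower, _, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

In a monitoring setting, one of these statistics (for example, weighted kappa per administration) would be computed for each new batch of responses and checked against limits derived from an earlier, stable period. Repeating the calculation within each native language group gives the subgroup comparison the abstract describes.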

Authors