Paper Summary

Can ChatGPT Provide Useful Holistic Essay Scoring?

Sun, April 14, 3:05 to 4:35pm, Pennsylvania Convention Center, Floor: Level 100, Room 113B

Abstract

Researchers have sought for decades to automate holistic essay scoring in order to reduce the burden on teachers and provide timely feedback to students. Over the years, these programs have improved significantly and can reach substantial agreement with human raters. However, such accuracy requires significant amounts of training on human-scored texts, which reduces the expediency and usefulness of such programs for routine use by teachers across the nation on non-standardized prompts. This study analyzes the scores that multiple versions of ChatGPT assign to secondary student essays from two extant corpora and compares them to high-quality human ratings. We find that the current iteration of ChatGPT does not produce scores that are substantially comparable to human ratings.

Authors