Paper Summary

Evaluation of GPT-generated Readability Scores

Sun, April 12, 7:45 to 9:15am PDT, JW Marriott Los Angeles L.A. LIVE, Floor: Gold Level, Gold 3

Abstract

GPT is increasingly used in educational settings to create reading materials tailored to specific grade levels. However, the accuracy of GPT-generated readability scores has not yet been systematically evaluated. This study assesses GPT’s effectiveness at readability evaluation by comparing its grade-level predictions with traditional readability metrics across three genres: language arts, social studies, and science. Results revealed moderate to strong correlations between GPT-generated scores and traditional metrics: alignment was strongest for science texts, followed by social studies, and weakest for language arts. Additionally, zero-shot prompting performed comparably to few-shot prompting in estimating readability. These findings underscore GPT’s potential for automated readability assessment and offer practical implications for leveraging generative AI in educational contexts.
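To make the comparison concrete, the sketch below illustrates the kind of pipeline such an evaluation implies: score each passage with a traditional formula (Flesch-Kincaid Grade Level) and correlate those scores with grade-level estimates elicited from GPT. The prompt wording, helper names, and naive syllable heuristic are illustrative assumptions, not the authors' materials.

```python
# A minimal sketch, not the authors' code: correlate GPT grade-level estimates
# with a traditional readability metric (Flesch-Kincaid Grade Level).
import re
import statistics  # statistics.correlation requires Python 3.10+

def count_syllables(word: str) -> int:
    # Rough vowel-group heuristic; published studies typically use a
    # pronunciation dictionary or a library such as textstat instead.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

# A hypothetical zero-shot prompt of the kind the study evaluates; the actual
# prompts used by the authors are not given in the abstract.
ZERO_SHOT_PROMPT = (
    "Estimate the U.S. grade level (1-12) of the following passage. "
    "Respond with a single number.\n\nPassage: {passage}"
)

def correlate_with_gpt(passages: list[str], gpt_grades: list[float]) -> float:
    # Pearson correlation between GPT's estimates (collected separately by
    # sending ZERO_SHOT_PROMPT to the model) and the formula-based scores.
    fkgl_scores = [flesch_kincaid_grade(p) for p in passages]
    return statistics.correlation(gpt_grades, fkgl_scores)
```

Under this setup, a genre with strong alignment, such as the science texts, would yield a Pearson correlation closer to 1.0 than a weakly aligned genre like language arts.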

Authors