Paper Summary

Evaluation of GPT-generated Readability Scores

Sun, April 12, 7:45 to 9:15am PDT, JW Marriott Los Angeles L.A. LIVE, Floor: Gold Level, Gold 3

Abstract

GPT is increasingly used in educational settings to create reading materials tailored to specific grade levels. However, the accuracy of GPT-generated readability scores has not yet been systematically evaluated. This study assesses GPT’s effectiveness at readability evaluation by comparing its grade-level predictions with traditional readability metrics across three genres: language arts, social studies, and science. Results revealed moderate to strong correlations between GPT-generated scores and traditional metrics: alignment was strongest for science texts, followed by social studies, and weakest for language arts. Additionally, zero-shot prompting performed comparably to few-shot prompting in estimating readability. These findings underscore GPT’s potential for automated readability assessment and offer practical implications for leveraging generative AI in educational contexts.
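To make the comparison concrete, the sketch below illustrates the kind of pipeline such an evaluation implies: score each passage with a traditional formula (Flesch-Kincaid Grade Level) and correlate those scores with grade-level estimates elicited from GPT. The prompt wording, helper names, and naive syllable heuristic are illustrative assumptions, not the authors' materials.

```python
# A minimal sketch, not the authors' code: correlate GPT grade-level estimates
# with a traditional readability metric (Flesch-Kincaid Grade Level).
import re
import statistics  # statistics.correlation requires Python 3.10+

def count_syllables(word: str) -> int:
    # Rough vowel-group heuristic; published studies typically use a
    # pronunciation dictionary or a library such as textstat instead.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # FKGL = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    if not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

# A hypothetical zero-shot prompt of the kind the study evaluates; the actual
# prompts used by the authors are not given in the abstract.
ZERO_SHOT_PROMPT = (
    "Estimate the U.S. grade level (1-12) of the following passage. "
    "Respond with a single number.\n\nPassage: {passage}"
)

def correlate_with_gpt(passages: list[str], gpt_grades: list[float]) -> float:
    # Pearson correlation between GPT's estimates (collected separately by
    # sending ZERO_SHOT_PROMPT to the model) and the formula-based scores.
    fkgl_scores = [flesch_kincaid_grade(p) for p in passages]
    return statistics.correlation(gpt_grades, fkgl_scores)
```

Under this setup, a genre with strong alignment, such as the science texts, would yield a Pearson correlation closer to 1.0 than a weakly aligned genre like language arts.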

Authors