Paper Summary

Exploring the Use of LLMs for Assessing Creativity in Student Programming Artifacts

Sat, April 11, 7:45 to 9:15am PDT, JW Marriott Los Angeles L.A. LIVE, Floor: Gold Level, Gold 3

Abstract

This study explores GPT-4o’s ability to assess creativity in 383 student-generated music programs. Human experts rated each artifact on four dimensions, and these ratings served as the reference for the model's scores. Of the three prompting strategies compared, the theory-driven few-shot (ECD) approach was the most accurate (MAE = 0.582, with 82.4% of scores within ±1 of the human rating), while zero-shot prompting showed the highest correlation with human scores. All LLM-based methods outperformed rule-based baselines, suggesting their potential for scalable, interpretable creativity assessment in K–12 computing.
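
The agreement metrics named in the abstract can be illustrated with a minimal sketch. It assumes paired integer ratings for the same artifacts; the function names and the sample scores are hypothetical and are not the study's data. The abstract does not say which correlation coefficient was used, so Pearson's r is an assumption here.

```python
from math import sqrt

def mean_absolute_error(human, model):
    """Average absolute gap between human and model scores (MAE)."""
    return sum(abs(h - m) for h, m in zip(human, model)) / len(human)

def within_tolerance(human, model, tol=1):
    """Fraction of model scores that fall within +/- tol of the human score."""
    hits = sum(1 for h, m in zip(human, model) if abs(h - m) <= tol)
    return hits / len(human)

def pearson_r(x, y):
    """Pearson correlation (assumed; the abstract does not name the coefficient)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up ratings for five artifacts, for illustration only.
human_scores = [1, 2, 3, 4, 5]
model_scores = [1, 3, 3, 2, 5]

mae = mean_absolute_error(human_scores, model_scores)
share_within_one = within_tolerance(human_scores, model_scores)
r = pearson_r(human_scores, model_scores)
```

The sketch keeps error and correlation separate because they measure different things: a method can track the ordering of human scores closely (high correlation) while still being offset from them (higher MAE), which is consistent with different prompting strategies leading on different metrics.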

Authors