This study examines how well GPT-4o can assess creativity in 383 student-generated music programs. Human experts rated each artifact on four dimensions, and three prompting strategies were compared against those ratings. The theory-driven few-shot (ECD) approach was the most accurate (MAE = 0.582; 82.4% of scores within ±1 of the human rating), while zero-shot prompting correlated most strongly with human scores. All LLM-based methods outperformed rule-based baselines, which suggests their potential for scalable, interpretable creativity assessment in K–12 computing.
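
The three agreement measures named in the abstract (MAE, share of scores within ±1, and correlation with human scores) can be illustrated with a minimal sketch. This is not the study's evaluation code: the function names are hypothetical, the scores are made-up illustrative values, and Pearson correlation is an assumption, since the abstract does not say which correlation coefficient was used.

```python
from math import sqrt


def mean_absolute_error(human, model):
    """Average absolute gap between human and model scores."""
    return sum(abs(h - m) for h, m in zip(human, model)) / len(human)


def within_tolerance(human, model, tol=1.0):
    """Fraction of model scores that fall within +/- tol of the human score."""
    return sum(abs(h - m) <= tol for h, m in zip(human, model)) / len(human)


def pearson_r(x, y):
    """Pearson correlation (assumed here; the abstract does not name the coefficient)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Illustrative scores only; these are NOT data from the study.
human_scores = [3, 4, 2, 5, 1]
model_scores = [3, 3, 4, 5, 2]

print("MAE:", mean_absolute_error(human_scores, model_scores))
print("Within +/-1:", within_tolerance(human_scores, model_scores))
print("Pearson r:", round(pearson_r(human_scores, model_scores), 3))
```

Accuracy and correlation measure different things: correlation is unaffected by a constant offset between raters, whereas MAE penalizes it. That is why one prompting strategy can have the lowest error while another has the highest correlation.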