Paper Summary
Developing Standards-Aligned Assessment Content with AI: Challenges and Early Lessons

Sat, April 11, 3:45 to 5:15pm PDT (3:45 to 5:15pm PDT), InterContinental Los Angeles Downtown, Floor: 5th Floor, Hancock Park West

Abstract

As generative AI enters assessment development, many anticipate gains in efficiency and innovation—but questions of fairness remain. This session examines WestEd’s work using customized and fine-tuned large language models to create assessment passages, revealing both progress and persistent challenges. A comparative study of baseline and fine-tuned GPT models shows that light customization alone fails to address readability, grade level, and cultural relevance. Fine-tuning, paired with human review and rubric-based evaluation, yields measurable improvement, yet limitations persist. The session probes whose voices AI preserves, how engagement is defined, and why human judgment remains essential. Attendees will gain practical insights for implementing, evaluating, and governing AI responsibly within assessment design and policy decision-making.
