Session Summary

Large Language Models in Assessment: Guidelines, Characteristics and Biases

Sun, April 14, 3:05 to 4:35pm, Convention Center, Floor: First, 121B

Session Type: Coordinated Paper Session

Abstract

In the rapidly evolving landscape of education, testing, and assessment, AI-powered large language models (LLMs) such as ChatGPT are becoming increasingly prevalent. This coordinated session explores various aspects of LLMs in order to better understand their properties, strengths, and weaknesses. While the focus is on writing assessments, we conjecture that many of the presented findings will generalize to other areas as well. The session features four talks. The first two explore the linguistic characteristics and overall quality of AI-generated essays: the first talk studies the differences between several state-of-the-art LLMs, both proprietary and open-source, while the second focuses on the effect of the sampling temperature, a simple yet consequential parameter. The remaining two talks focus on the detection of AI-generated essays: the third provides guidelines and discusses best practices for the use of such detectors, while the fourth investigates potential biases that detectors may show against, or in favor of, non-native speakers.
