Paper Summary

Using Generative Artificial Intelligence Tools to Develop Multiple-Choice Assessment Items: An Effectiveness Study

Sat, April 26, 3:20 to 4:50pm MDT, The Colorado Convention Center, Floor: Meeting Room Level, Room 705

Abstract

Generative artificial intelligence (GenAI) tools developed to support teaching are widely available. Trustworthiness concerns have prompted calls for researchers to study their effectiveness. We investigated one type of GenAI created to support teachers: multiple-choice question generators (MCQ GenAI). Among the nine MCQ GenAI tools investigated in this study, one indicated teacher involvement, and none mentioned testing experts in development processes. MCQ GenAI-created items (n=90) were coded based on MCQ item-writing guidelines. Results showed that 84.44% of items violated at least one guideline: 77.78% were likely to produce major measurement error (should not be used without revision) and 6.67% were likely to elicit minor measurement error (consider modifying). The remaining 15.56% were acceptable (usable as created). The findings suggest that multidisciplinary teams are needed in educational GenAI tool development.
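
The percentages reported in the abstract can be checked against the sample size with a short calculation. The sketch below is illustrative only: the per-category item counts are not stated in the abstract and are back-calculated here from the reported percentages and n=90, and the names `reported_pct` and `implied_counts` are hypothetical.

```python
# Back-calculate per-category item counts from the percentages in the abstract.
# ASSUMPTION: the counts are inferred, not reported by the authors.
N_ITEMS = 90  # total items coded, as stated in the abstract

# Percentages as reported in the abstract
reported_pct = {
    "major": 77.78,       # likely major measurement error
    "minor": 6.67,        # likely minor measurement error
    "acceptable": 15.56,  # usable as created
}


def implied_counts(pcts, n):
    """Return the whole-item count implied by each percentage of n."""
    return {label: round(pct / 100 * n) for label, pct in pcts.items()}


counts = implied_counts(reported_pct, N_ITEMS)

# Items violating at least one guideline = major + minor categories
violating = counts["major"] + counts["minor"]
pct_violating = round(100 * violating / N_ITEMS, 2)

print(counts)
print(violating, pct_violating)
```

Under these assumptions the three categories account for all 90 items, and the major and minor categories together reproduce the 84.44% figure, which suggests the reported percentages are internally consistent.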

Authors