Individual Submission Summary
Poster #31 - Can Large Language Models Predict Experimental Results in Criminology? Evidence from Simulated Survey Responses

Thu, Nov 13, 7:30 to 8:30pm, Marquis Salon 5 - M2

Abstract

This study evaluates the potential of large language models (LLMs) to simulate the outcomes of experimental research in criminal justice. We constructed a comprehensive archive of survey experiments published in the Journal of Experimental Criminology, encompassing diverse populations and research designs. Using this dataset, we prompted four advanced, publicly available LLMs (GPT-4o, Grok 3, Gemini 2.0 Flash, and Claude 3.7 Sonnet) to predict how representative samples of Americans would respond to various experimental stimuli. Simulated treatment effects showed a moderate correlation with actual outcomes (r = 0.70) in simple designs targeting general populations. However, predictive accuracy declined for experiments involving more complex designs (e.g., conjoint analysis) or specific populations (e.g., college students). We also examined variation in predictive performance across demographic subgroups and topical domains. Our findings suggest that LLMs can complement experimental methods in criminological research, but they also underscore the limitations and risks of relying on AI-generated predictions.

Authors