This study evaluates the potential of large language models (LLMs) to simulate the outcomes of experimental research in criminal justice. We constructed a comprehensive archive of survey experiments published in the Journal of Experimental Criminology, encompassing diverse populations and research designs. Using this dataset, we prompted four advanced, publicly available LLMs (GPT-4o, Grok 3, Gemini 2.0 Flash, and Claude 3.7 Sonnet) to predict how representative samples of Americans would respond to various experimental stimuli. Simulated treatment effects correlated moderately with actual outcomes (r = 0.70) in simple designs targeting general populations. However, predictive accuracy declined for experiments involving more complex designs (e.g., conjoint analysis) or specific populations (e.g., college students). We also examined variation in predictive performance across demographic subgroups and topical domains. Our findings suggest that LLMs can complement experimental methods in criminological research, but they also underscore the limitations and risks of relying on AI-generated predictions.