Sociologists increasingly employ large language models (LLMs) to simulate human subjects and predict life outcomes, but it remains unclear whether LLM reasoning reflects sociological reasoning, which integrates multilevel factors and temporal dynamics. Using three reasoning LLMs (DeepSeek-R1, QwQ-32B, GPT-OSS-120B) on 1,949 cases from the Fragile Families and Child Wellbeing Study, we analyze predictions of age-22 educational attainment and aspirations.
We uncover a homogenization bias: LLMs compress outcome variation toward modal categories, underestimating both extreme outcomes and intergenerational mobility. Trace analysis identifies four mechanisms driving this compression: cognitive narrowing (reducing the diversity of factors considered), semantic proximity bias (favoring variables semantically adjacent to the outcome), temporal recency bias (overweighting recent data), and risk aversion (favoring moderate predictions). Together, these tendencies flatten multivariate and temporal complexity, pulling predictions away from heterogeneous outcomes. Non-reasoning models display similar patterns, suggesting the bias may hold across model architectures.
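To make the homogenization claim concrete, the sketch below shows one way such compression could be quantified: comparing the Shannon entropy of the predicted and observed outcome distributions. This is an illustration under assumed inputs, not the paper's actual analysis; the `observed` and `predicted` arrays, the five-level attainment coding, and the simulated probabilities are all hypothetical.

```python
import numpy as np
from scipy.stats import entropy

def distribution_entropy(labels, n_categories):
    """Shannon entropy of a categorical outcome distribution."""
    counts = np.bincount(labels, minlength=n_categories)
    return entropy(counts / counts.sum())

# Hypothetical data: attainment coded 0 (less than high school)
# through 4 (graduate degree), n = 1,949 as in the study.
rng = np.random.default_rng(0)
observed = rng.integers(0, 5, size=1949)  # spread across all categories
# Simulated homogenized predictions: mass concentrated on the modal
# middle categories, extremes never predicted.
predicted = rng.choice([1, 2, 3], size=1949, p=[0.2, 0.6, 0.2])

print(distribution_entropy(observed, 5))   # higher entropy: heterogeneous outcomes
print(distribution_entropy(predicted, 5))  # lower entropy: compression toward the mode
```

Under this framing, homogenization bias would appear as a persistent entropy (or variance) gap between predicted and observed distributions, offering one auditable signal of compression toward modal categories beyond raw accuracy.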
Our results indicate that LLMs follow specific inferential pathways that simplify reasoning rather than engaging the complexity of the life course, underscoring the need to audit inferential dynamics, not merely accuracy, when integrating LLMs into social science research.