Individual Submission Summary

Survey as Life-course Annotation: Extracting Life Histories from Survey Data Using Large Language Models

Tue, August 11, 10:00 to 11:30am, TBA

Abstract

Previous literature in historical sociology argues that quantitative social science has long relied on generalized linear models, a methodological approach that reduces complex social processes to simple linear transformations of variables. To bridge the gap between historical sociology’s emphasis on social processes and the methodological toolkit available to sociologists, I propose “survey as life-course annotation.” This approach conceptualizes survey participation as an individual’s self-annotation of their life history, which generates pairs of survey questions and response labels in natural language. Using this text data and large language models (LLMs), I extract and construct the life-course narrative of each survey respondent. Data were drawn from the Future of Families and Child Wellbeing Study (FFCWS). Gemini-3-Pro was used to process an average of 3,229 variables per child to construct life-course narratives and social-event sequence trees. I discuss example outputs from the model and their implications for sociological methodology. Ultimately, this framework can offer a narrative-based understanding of individuals within large-scale datasets.

Author