Individual Submission Summary

A Test of “Donor” Behavior in Simulated Experiments Using ChatGPT: Implications, promises, and threats for nonprofit scholarship

Fri, July 19, 11:00am to 12:30pm, TBA

Abstract

In recent years, there has been a noticeable rise in the use of machine learning tools such as language models to enhance research in the social sciences (Argyle et al., 2023), including nonprofit research (e.g., Lovejoy & Saxton, 2012; Pandey & Pandey, 2019). However, the full range of possibilities for using large-scale generative language models like ChatGPT in social science research remains largely uncharted. Because ChatGPT's training incorporated the cultural knowledge and conversational nuances of authentic human language (Adiwardana et al., 2020; Radford et al., 2019), we propose that generative language models can be harnessed to replicate human social and political behaviors, thus serving as a valuable instrument in survey research.

Argyle and colleagues (2023) argue that, although language model biases are widely considered problematic, conditioning a model on specific identity and demographic information allows researchers to replicate the responses of particular human subgroups. They find that adequately conditioned language models, especially GPT-3, generate outputs that closely mirror human response patterns, including voting behavior.
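To make this conditioning step concrete, the following is a minimal Python sketch in the spirit of Argyle and colleagues' approach; the profile fields and first-person wording are our own illustration, not their actual prompts.

def backstory(profile: dict) -> str:
    """Render a demographic profile as a first-person conditioning prompt."""
    return (
        f"I am a {profile['age']}-year-old {profile['gender']} living in "
        f"{profile['state']}. I identify as {profile['race']}, my household "
        f"income is about {profile['income']}, and politically I consider "
        f"myself {profile['ideology']}."
    )

# One illustrative respondent profile; the survey item is appended to the
# backstory so the model answers "in character."
profile = {
    "age": 47, "gender": "woman", "state": "Ohio", "race": "white",
    "income": "$60,000 a year", "ideology": "a moderate",
}
prompt = backstory(profile) + "\nIn the past 12 months, did you donate money to a charitable organization? Answer yes or no."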

In this study, we test the algorithmic fidelity and applicability of generative language models for survey research in nonprofit studies, with implications for social science research more generally. Specifically, we employ GPT-4 to produce a synthetic dataset of 1,000 simulated responses representative of US adults in October 2020, and we use these simulated data to reproduce an actual survey experiment the authors conducted with real US adult respondents in that same period. The full survey includes questions on demographic characteristics, voting behavior, political preferences, and donation and volunteering behavior, as well as items replicated directly from large population surveys such as the US Census and the General Social Survey. We condition GPT-4 to provide a full set of representative “responses” to the survey, and we replicate the experimental design by assigning each treatment to a separate ChatGPT chat (see the sketch below). We use this simulation to test how closely the survey experiment's findings (on donation responses to organizational vignettes) are reproduced in the artificial setting, and whether the experiment's more nuanced findings, which capture more complex interactions between respondents' demographic characteristics and the experimental treatments, also hold.

Our findings show that, used appropriately, AI language models can help researchers formulate survey questions, refine experimental treatments, and fully test analytical methods before fielding, thereby allowing human subjects research to be piloted at significantly reduced cost. We discuss the implications for nonprofit researchers and for social science more generally, including the tradeoffs and potential threats that AI tools pose to the field, and how nonprofit scholars might approach the use of these tools in academia more broadly.
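The simulation pipeline can be illustrated with a minimal sketch assuming the OpenAI Python SDK (v1+); the model identifier, vignette texts, and outcome item below are hypothetical placeholders, not the study's actual materials.

import random
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical organizational vignettes standing in for the actual treatments.
VIGNETTES = {
    "control": "Riverbend Food Bank is a nonprofit serving your county.",
    "overhead": ("Riverbend Food Bank is a nonprofit serving your county. "
                 "It spends 30% of every donation on administrative costs."),
}

ITEM = ("Imagine you received an unexpected $100. How many dollars would you "
        "give to this organization? Reply with a number from 0 to 100.")

def simulate_response(persona: str, treatment: str) -> str:
    """One simulated respondent; a fresh messages list acts as a separate chat."""
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": persona},  # demographic conditioning
            {"role": "user", "content": VIGNETTES[treatment] + "\n" + ITEM},
        ],
        temperature=1.0,  # preserve response variability across respondents
    )
    return completion.choices[0].message.content

persona = ("I am a 47-year-old woman living in Ohio. I identify as white, "
           "my household income is about $60,000 a year, and politically I "
           "consider myself a moderate.")
print(simulate_response(persona, random.choice(list(VIGNETTES))))

In the full design, this call would be repeated for each of the 1,000 simulated respondents, with treatments randomly assigned and each call opening a fresh chat so that no context leaks across conditions.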

References

Adiwardana, D., Luong, M. T., So, D. R., Hall, J., Fiedel, N., Thoppilan, R., ... & Le, Q. V. (2020). Towards a human-like open-domain chatbot. arXiv preprint arXiv:2001.09977.

Argyle, L. P., Busby, E. C., Fulda, N., Gubler, J. R., Rytting, C., & Wingate, D. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, 31(3), 337-351.

Lovejoy, K., & Saxton, G. D. (2012). Information, community, and action: How nonprofit organizations use social media. Journal of Computer-Mediated Communication, 17(3), 337–353.

Pandey, S., & Pandey, S. K. (2019). Applying natural language processing capabilities in computerized textual analysis to measure organizational culture. Organizational Research Methods, 22(3), 765–797.

Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI blog, 1(8), 9.
