Large language models (LLMs) are discussed as a major innovation in
the human-computer interface. Proposals for their use range from
enabling non-programmers to build their own applications and aiding
doctors with clinical transcription to serving as artificial survey
respondents and supporting military and intelligence work. Yet LLMs
are innately "culture machines": they are trained on a corpus of
cultural materials to produce output such as text. If a "culture
machine" were to perform an alien culture rather than the one a user
expects, miscommunication might result, with potentially severe
consequences. How well LLMs perform any human culture remains an
open question.
To study this question, we use the EPA cognitive dimensions
introduced by Osgood et al. (1957). These measure human cultures by
assessing where they place terms on three universal dimensions:
evaluation (good/bad), potency (powerful/weak), and activity
(lively/still). Human cultures may place a given term at different
locations on each dimension, but all make use of these three
dimensions. By examining how similarly LLMs and members of human
cultures place terms, we have a means of comparing the two. Using
historical EPA ratings from many time periods (1980-2015) and
countries (covering North America, Asia, Africa, and Europe), as
well as LLM data collected in 2026 (covering ChatGPT, Deepseek,
Claude, Grok, and Gemini), we find that LLMs produce EPA ratings
that are culturally distinct from human cultures across time and
space. The implications of this apparent mismatch are also discussed.
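To make the comparison concrete, here is a minimal sketch of how EPA profiles from two sets of raters might be compared: each term is a point in the three-dimensional evaluation-potency-activity space, and a simple average distance over shared terms summarizes how far apart two cultures place them. The ratings, terms, and scale below are hypothetical illustrations, not the study's data or method.

```python
import numpy as np

# Hypothetical EPA ratings (evaluation, potency, activity), each on the
# -4.3 to +4.3 semantic-differential scale conventional in EPA research.
# All numbers below are illustrative, not drawn from the study.
human_epa = {
    "mother": np.array([3.1, 1.8, 1.2]),
    "criminal": np.array([-2.7, 0.9, 1.1]),
}
llm_epa = {
    "mother": np.array([2.4, 2.5, 0.6]),
    "criminal": np.array([-3.2, 1.4, 1.5]),
}

def epa_distance(ratings_a, ratings_b):
    """Mean Euclidean distance between two EPA profiles,
    averaged over the terms both raters share."""
    shared = ratings_a.keys() & ratings_b.keys()
    dists = [np.linalg.norm(ratings_a[t] - ratings_b[t]) for t in shared]
    return sum(dists) / len(dists)

print(f"LLM vs. human EPA distance: {epa_distance(human_epa, llm_epa):.2f}")
```

Under this kind of measure, a small distance would indicate that an LLM places terms much as members of a given human culture do, while a large distance would signal the sort of cultural mismatch the abstract describes.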