Session Submission Summary

Applications of Topic Modeling in Communication Research: Potentials and Pitfalls

Fri, May 26, 8:00 to 9:15, Hilton San Diego Bayfront, Floor: 3, Aqua 305

Session Submission Type: Panel


Quantitative content analysis is one of the most central and widely used methods for analyzing media content and other forms of written communication (Kamhawi & Weaver, 2003). Previously, when media coverage was limited to traditional outlets, the common research practice was to draw a representative sample and code it manually to detect important topics, positions, frames or actors. However, the digital revolution, and in particular the advent of the internet, has dramatically changed not only the object of communication research, through an expanding public sphere, but even more so the ways in which communicative content is accessed and analytically processed. The accessibility of online communication certainly offers new possibilities for communication researchers, but it also challenges some of the established research methods of our discipline, which quickly reach their limits when confronted with large amounts of data (Günther & Quandt, 2015; Guo et al., 2016).
Against this background, automated methods of content analysis offer greater analytic capacity. Particularly promising are recent Bayesian statistical approaches to topic modeling, above all Latent Dirichlet Allocation (LDA) (Blei, Ng & Jordan, 2003), which detect thematic patterns in large collections of text inductively and are therefore able to uncover the hidden structure of topics within a corpus (Blei, 2012). Yet although topic modeling approaches can process large amounts of text, they are relatively new, and their application in communication research remains largely uncharted territory. The present panel therefore gathers contributions that address the theoretical premises of topic modeling (contribution 1), examine methodological issues (contribution 2) and present empirical applications of the LDA approach to large-scale text analysis (contributions 3 and 4). Theoretically, it is for instance unclear how the results of LDA relate to central communication concepts such as “issues”. Methodologically, the panel shows that LDA’s inductive logic can help researchers map the topics of domains unfamiliar to them. Finally, the empirical applications explore the value of LDA models in assessing the topic dynamics between news media and users’ online comments, as well as how different forms of homophily structure the online space. Taken together, these perspectives allow us to better understand the advantages, but also the blind spots and drawbacks, of a topic modeling perspective compared with traditional methods.

Individual Presentations