Individual Submission Summary

Beyond the Cluster Default: A Decision Framework for Unsupervised Classification

Tue, August 11, 10:00 to 11:30am, TBA

Abstract

Unsupervised classification methods are ubiquitous in social science, often used to reduce complex attitudinal data into interpretable typologies. However, researchers frequently default to a specific algorithm without explicitly testing whether the underlying data structure supports a discrete partitioning. This "clustering default" risks generating false positives: identifying qualitative distinctions where none exist. This article introduces a systematic decision framework that compares the outputs of multiple classification algorithms (latent class analysis (LCA), k-means, hierarchical clustering, and density-based clustering) using a range of internal validity indices. We demonstrate the utility of this framework by revisiting a debate on American nationalist sentiment (Bonikowski and DiMaggio 2016). While the original study identified four distinct "varieties" of nationalism, our framework reveals that the data do not support a stable partition, suggesting that the identified clusters are methodological artifacts. We conclude that rigorous model selection is required to distinguish genuine social heterogeneity from artificial segmentation.
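
As a rough illustration of the kind of comparison the abstract describes, the sketch below fits several clustering algorithms and scores each partition with internal validity indices. It is not the authors' framework. It assumes scikit-learn, uses a Gaussian mixture as a stand-in for LCA (scikit-learn has no LCA estimator), and picks three common indices (silhouette, Calinski-Harabasz, Davies-Bouldin) that the abstract does not name. The function names, the candidate cluster counts, and the DBSCAN settings are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.mixture import GaussianMixture
from sklearn.metrics import (
    silhouette_score,
    calinski_harabasz_score,
    davies_bouldin_score,
)


def score_partition(name, k, X, labels):
    """Score one partition with three internal validity indices."""
    labels = np.asarray(labels)
    mask = labels >= 0  # DBSCAN labels noise points as -1; drop them
    n_found = len(set(labels[mask]))
    row = {"method": name, "k": k, "n_clusters": n_found,
           "silhouette": float("nan"),
           "calinski_harabasz": float("nan"),
           "davies_bouldin": float("nan")}
    if n_found < 2:
        # The indices are undefined for fewer than two clusters.
        return row
    Xc, lc = X[mask], labels[mask]
    row["silhouette"] = silhouette_score(Xc, lc)                # higher is better
    row["calinski_harabasz"] = calinski_harabasz_score(Xc, lc)  # higher is better
    row["davies_bouldin"] = davies_bouldin_score(Xc, lc)        # lower is better
    return row


def compare_partitions(X, k_values=(2, 3, 4, 5), eps=0.5, min_samples=5, seed=0):
    """Fit each algorithm at each candidate k and collect index scores."""
    rows = []
    for k in k_values:
        models = {
            "kmeans": KMeans(n_clusters=k, n_init=10, random_state=seed),
            "hierarchical": AgglomerativeClustering(n_clusters=k, linkage="ward"),
            # Assumption: Gaussian mixture used in place of LCA.
            "mixture": GaussianMixture(n_components=k, random_state=seed),
        }
        for name, model in models.items():
            rows.append(score_partition(name, k, X, model.fit_predict(X)))
    # DBSCAN chooses its own number of clusters, so it is fitted once.
    db_labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    rows.append(score_partition("dbscan", None, X, db_labels))
    return rows
```

If the indices disagree across algorithms, or no candidate number of clusters clearly stands out, that would be one signal that the data may not support a discrete partition. The sketch does not test partition stability, which would need an additional step such as resampling.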

Authors