Don't Get Duped: Analyzing Near Duplicates in Survey Data

In Event: Issues and Advances in Measurement and Design

Wed, Nov 13, 3:30 to 4:50pm, Salon 5 - Lower B2 Level

Abstract

Research fraud has garnered increasing attention in the broader scientific community and in criminology specifically. Here, we adopt recently advocated approaches to identifying potentially fraudulent survey data: pairwise comparisons of substantive variables between each pair of observations in the data to detect "near duplicate" responses. We demonstrate these methods across four criminological surveys collected in different international locations. Preliminary results suggest that typical features of self-report crime data (e.g., skewed distributions) are likely to produce a substantial number of near-duplicate cases through non-fraudulent data-generating processes. Though these "forensic" methods may not provide conclusive evidence for or against data falsification, we recommend routinely adopting them into criminological data analysis workflows. They can detect obvious instances of data falsification, identify potentially problematic cases, and generally improve understanding of important features of one's data.
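The pairwise comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistic the authors compute, so this example assumes one simple choice: for each respondent, the highest share of substantive items on which they agree with any other respondent. The function name `max_percent_match`, the toy data, and the 0.75 flagging threshold are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def max_percent_match(rows: Sequence[Sequence[object]]) -> List[float]:
    """For each row, return the highest share of items (0.0 to 1.0) on which
    it agrees with any other row. Assumes all rows have the same, nonzero
    number of items."""
    n = len(rows)
    best = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Count the items on which respondents i and j gave the same answer.
            matches = sum(a == b for a, b in zip(rows[i], rows[j]))
            share = matches / len(rows[i])
            best[i] = max(best[i], share)
            best[j] = max(best[j], share)
    return best


# Hypothetical toy data: 5 respondents, 4 self-report items each.
responses = [
    [0, 0, 1, 3],
    [0, 0, 1, 3],  # exact duplicate of respondent 0
    [0, 0, 1, 2],  # differs from respondents 0 and 1 on one item
    [4, 2, 0, 1],
    [0, 0, 0, 0],
]

scores = max_percent_match(responses)

# Hypothetical cutoff; an appropriate value depends on the survey.
THRESHOLD = 0.75
flagged = [i for i, s in enumerate(scores) if s >= THRESHOLD]

print(scores)   # [1.0, 1.0, 0.75, 0.25, 0.5]
print(flagged)  # [0, 1, 2]
```

A high score here is a prompt for closer inspection, not evidence of fraud. In skewed self-report data, where many respondents legitimately answer zero to most items, honest respondents can match one another on most or all variables, which is the caution the abstract raises.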
