Individual Submission Summary

But How Do We Store It? (Big) Data Architecture in the Social-Scientific Research Process

Fri, May 26, 12:30 to 13:45, Hilton San Diego Bayfront, Floor: 4 (Sapphire), Exhibit Hall - Rear

Abstract

As datasets grow in size and complexity, communication researchers face a new problem: how should these data be managed? While the methodological literature has addressed the collection and analysis of large-scale media datasets, the step in between, systematic storage, has been largely neglected, as has the question of where to situate preprocessing within the workflow. In this paper, we situate such considerations within the social-scientific research process and offer guidelines for deciding on a suitable data architecture. We show that decisions on how to store the data influence later analysis. In particular, we discuss how the choice between schema-oriented databases with a fixed tabular structure and schema-less databases, which store documents without making assumptions about their data structure, can play out in a communication science research context.

Authors