Individual Submission Summary

Welcome to the Inferential Multiverse!

Tue, August 11, 8:00 to 9:30am, TBA

Abstract

Insofar as it is oriented toward producing a cumulative body of knowledge about the social world, quantitative sociology is fundamentally a collective endeavor. In this context, substantial between-study variation in the value of a parameter estimate is a source of concern because it signals a lack of scientific consensus within the sociological community. With this problem in mind, researchers have increasingly turned to multimodel methods such as multiverse analysis and specification curve analysis. Proponents of these approaches argue that the best way for individual research teams to proactively account for researcher degrees of freedom is to put ourselves in the shoes of our hypothetical colleagues and estimate a separate model for each combination of analytical choices in the garden of forking paths that connects our raw data to the set of feasible point estimates that make up a given multiverse or specification curve. While multimodel methods have been used to good effect to describe the extent to which a result depends on the analytical choices of the researcher, we still lack a fully integrated framework for efficient multimodel inference. More specifically, we lack a parametric framework that would allow us to bring together information on between-model variation and sampling variance in a coherent and statistically principled way. Using real-world data analysis and simulation, I show that this problem is readily solved by combining influence regression with seemingly unrelated estimation to produce parametric tests that allow us to make inferences about the location, scale, and dispersion structure of a multiverse or specification curve while simultaneously correcting for dependence in the underlying parameter estimates. The advantage of this approach is that it allows for meaningful multimodel inference without relying on bootstrap procedures, thereby avoiding the substantial computational burden that comes with repeatedly estimating a separate model for each combination of analytical choices.
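To make the combinatorial logic concrete, below is a minimal sketch of the enumeration step that multiverse and specification curve analyses share, assuming a toy synthetic dataset and two hypothetical researcher degrees of freedom (which controls to include, and whether to restrict the sample). The variable names and choice sets are illustrative only, statsmodels OLS stands in for whatever estimator a given application calls for, and the sketch deliberately stops short of the abstract's actual contribution, the influence-regression and seemingly-unrelated-estimation machinery for joint inference.

```python
# Minimal multiverse sketch: fit one OLS model per combination of
# analytical choices and collect the focal point estimate from each
# specification. All data and choice sets are synthetic illustrations.
import itertools

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "x": rng.normal(size=n),    # focal predictor
    "z1": rng.normal(size=n),   # optional control 1
    "z2": rng.normal(size=n),   # optional control 2
    "w": rng.uniform(size=n),   # variable used for a sample restriction
})
df["y"] = 0.5 * df["x"] + 0.3 * df["z1"] + rng.normal(size=n)

# Two researcher degrees of freedom: which controls to include, and
# whether to restrict the sample. Their Cartesian product is the
# (tiny) garden of forking paths.
control_sets = [[], ["z1"], ["z2"], ["z1", "z2"]]
sample_rules = [lambda d: d, lambda d: d[d["w"] > 0.1]]

estimates = []
for controls, restrict in itertools.product(control_sets, sample_rules):
    d = restrict(df)
    X = sm.add_constant(d[["x"] + controls])
    fit = sm.OLS(d["y"], X).fit()
    estimates.append({
        "controls": "+".join(controls) or "none",
        "restricted": restrict is not sample_rules[0],
        "beta_x": fit.params["x"],   # focal estimate
        "se_x": fit.bse["x"],
    })

# Sorting the estimates yields the raw material of a specification curve.
multiverse = pd.DataFrame(estimates).sort_values("beta_x")
print(multiverse)

# A naive summary of the curve's location ignores the fact that the
# specifications share observations, so their estimates are dependent;
# handling that dependence is exactly what the abstract's seemingly
# unrelated estimation step is meant to do.
print("median beta_x across specifications:", multiverse["beta_x"].median())
```

Even in this toy example, note that the number of fitted models grows multiplicatively with each added analytical choice, which is the computational burden the abstract's parametric approach is designed to keep from compounding through bootstrap resampling.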
