Paper Summary

A General Two-Part Mixture Modeling for Semi-Continuous Response Variables

Fri, April 12, 4:55 to 6:25pm, Philadelphia Marriott Downtown, Floor: Level 4, Room 403

Abstract

This paper proposes a novel two-part mixture model to address the challenges posed by observed or latent semi-continuous response variables in a latent profile analysis (LPA). Traditional LPAs assume that latent class indicators are continuous variables with a multivariate normal distribution within each latent class. However, extreme violations of normality, particularly strong floor and ceiling effects, can affect the accuracy of latent class formation. The proposed model offers a more flexible approach for such scenarios.

In mathematical terms, continuous random variables (RVs) are unbounded, with a probability density function (PDF) spanning the real number line. In contrast, limited continuous RVs have PDFs that are bounded within defined intervals. In practical terms, most, if not all, of the continuous RVs we observe are limited due to natural or physical boundaries, theoretical restrictions, and the constraints of our measurements. This paper focuses on semi-continuous RVs, which are usually limited and have the additional distributional feature of one or more discrete point masses.

Two-part models (TPMs) are widely used for analyzing semi-continuous response variables, mainly focusing on ratio variables with a point mass at zero and a limited continuous distribution of positive values (e.g., Olsen & Schafer, 2001). “Part one” characterizes binary occurrence (zero or non-zero) and “Part two” characterizes the non-zero continuous process. The two parts are stochastically distinct but correlated and estimated simultaneously. Our proposed model addresses three challenges limiting the applicability of current TPMs: (1) handling multiple point masses; (2) addressing floor and ceiling effects; and (3) dealing with interval scales without a "true zero," such as those generated from factor analysis and IRT models.
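As a rough illustration of the conventional two-part structure described above (and not of the proposed mixture model), the following Python sketch simulates a variable with a point mass at zero and a log-normal positive part, then recovers the parameters of each part. The function names, the log-normal choice for the positive part, and the parameter values are assumptions made for this example only. The sketch has no covariates or random effects, so it omits the correlation between the two parts, and each part can be estimated in closed form.

```python
import math
import random


def simulate_two_part(n, p_nonzero, mu, sigma, seed=0):
    """Draw n values: 0 with probability 1 - p_nonzero, else log-normal(mu, sigma)."""
    rng = random.Random(seed)
    return [
        rng.lognormvariate(mu, sigma) if rng.random() < p_nonzero else 0.0
        for _ in range(n)
    ]


def fit_two_part(y):
    """Closed-form maximum likelihood estimates for both parts (no covariates)."""
    positives = [v for v in y if v > 0]
    # Part one: probability of a non-zero response.
    p_hat = len(positives) / len(y)
    # Part two: normal model for the log of the non-zero responses.
    logs = [math.log(v) for v in positives]
    mu_hat = sum(logs) / len(logs)
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in logs) / len(logs))
    return p_hat, mu_hat, sigma_hat


def two_part_loglik(y, p, mu, sigma):
    """Log-likelihood: a point mass at zero plus a log-normal density for y > 0."""
    ll = 0.0
    for v in y:
        if v == 0:
            ll += math.log(1 - p)
        else:
            z = (math.log(v) - mu) / sigma
            log_density = -math.log(v * sigma * math.sqrt(2 * math.pi)) - 0.5 * z * z
            ll += math.log(p) + log_density
    return ll


if __name__ == "__main__":
    y = simulate_two_part(5000, p_nonzero=0.6, mu=1.0, sigma=0.5, seed=1)
    p_hat, mu_hat, sigma_hat = fit_two_part(y)
    print(p_hat, mu_hat, sigma_hat)
    print(two_part_loglik(y, p_hat, mu_hat, sigma_hat))
```

Handling several point masses, ceiling effects, or scales without a true zero, as the proposed model does, would require replacing the single zero indicator and the log transformation used in this sketch.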

To illustrate the model, data from a school-based study on discrimination and adolescent well-being are used. Students reported on lifetime experiences of peer-perpetrated discrimination and bullying due to four identity characteristics using an adapted version of the Adolescent Discrimination Distress Index (Fisher, Wallace, & Fenton, 2000). The first aim of the study was to use mixture modeling to characterize intersectional experiences of discrimination (e.g., co-occurring racism and classism), similar to the approach used by Garnett et al. (2014), but with multiple measures of discrimination within each specified aspect of identity. The distributions of all four identity attribution scale scores evinced a strong floor effect with a prominent negative point mass, corresponding to no attributable discrimination, as well as a slightly smaller positive point mass, corresponding to experiencing occasional name-calling. We specify a two-part mixture model with an LCA for modeling location on both point masses and an LPA for modeling heterogeneity in limited continuous identity attribution scores, allowing the two latent class variables to covary in the full model. We compare our findings to the results of three alternative approaches: (1) the two-part factor mixture model of Kim and Muthén (2009); (2) LPA using censored-inflated normal distributions within class; and (3) LPA using the default multivariate normal distribution within class. The proposed model and the comparison models were all estimated in Mplus, Version 8.10 (Muthén & Muthén, 2023).

Authors