Paper Summary

Validation and Calibration in Mixture Modeling: The Exploratory Factor Analysis/Confirmatory Factor Analysis of Mixture Models

Sun, April 19, 8:15 to 10:15am, Virtual Room

Abstract

Mixture modeling is considered an exploratory modeling approach because, in most modeling contexts, the number and type of classes are not hypothesized a priori. As mixture models become more common, and as larger datasets and accessible open-source datasets containing the same measures become available, a confirmatory mixture modeling approach becomes more plausible. Exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) are widely used in educational research to explore (EFA) and then confirm (CFA) modeling results. A similar but less widely used approach in the context of mixture models is referred to as calibration and validation. There are many reasons this method may be underutilized; one of them, we believe, is a misunderstanding of how and when to use it, which this paper aims to address directly.
The calibration/validation (C/V; cf. Masyn, 2013) method parallels the EFA/CFA idea by providing an opportunity to validate the solution of a mixture model. As in EFA/CFA, researchers first fit an exploratory mixture model to one subsample and decide on the number of classes. A second subsample is then used to validate the findings of the exploratory stage. Model fit is compared to evaluate the plausibility of the emergent class solution across the two samples. The C/V approach allows researchers to examine their data further and to choose a solution with more confidence when fit statistics do not point to a single best model.
The C/V approach consisted of the following steps: (a) the sample was randomly split into two equivalent groups (samples A and B), (b) class enumeration was performed on sample A (calibration stage), (c) parameter estimates from sample A were used as fixed values in the sample B LCA (sample Bfixed; validation stage), (d) the equivalence of the parameter estimates between samples A and B was tested with a likelihood ratio (LR) test, (e) sample A was compared to a freely estimated sample B, and (f) the process was repeated with the samples reversed and again tested for differences.
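Steps (a) and (c) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure or software: the function names `split_half` and `lca_loglik`, the seed, and the toy parameter values are all hypothetical. The second function evaluates the log-likelihood of binary responses under a latent class model whose parameters are held fixed, which is the quantity a fixed-parameter validation run would report.

```python
import math
import random


def split_half(n, seed=0):
    """Step (a): randomly split indices 0..n-1 into samples A and B."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    half = n // 2
    return idx[:half], idx[half:]


def lca_loglik(data, class_probs, item_probs):
    """Step (c): log-likelihood of binary data under FIXED LCA parameters.

    data        -- list of 0/1 response vectors, one per respondent
    class_probs -- K class proportions (summing to 1)
    item_probs  -- K lists; item_probs[k][j] = P(item j = 1 | class k)
    """
    total = 0.0
    for y in data:
        lik = 0.0
        for pi_k, p_k in zip(class_probs, item_probs):
            cond = 1.0
            for y_j, p in zip(y, p_k):
                cond *= p if y_j == 1 else 1.0 - p
            lik += pi_k * cond  # mixture over latent classes
        total += math.log(lik)
    return total


# Hypothetical toy values (not estimates from the paper).
sample_a, sample_b = split_half(2274)
toy_ll = lca_loglik([[1, 1]], [0.5, 0.5], [[0.9, 0.9], [0.1, 0.1]])
```

In practice the fixed values passed to `lca_loglik` would be the estimates obtained from the calibration sample, and the data would be the validation sample's responses.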
To illustrate this approach, we used seven binary bullying items in a latent class analysis. Data were from the School Crime Supplement (SCS) of the National Crime Victimization Survey (NCVS; 2007). The sample consisted of 2,274 students in the 6th, 7th, and 8th grades. Class enumeration resulted in a 3-class solution: a low/no-bullying class (66%), a moderate-bullying class (28%), and a high-bullying class (6%). The calibration/validation LR results between sample A (LL = -2232.41) and sample Bfixed (LL = -2226.78) were non-significant (df = 23, p = .98), indicating that the 3-class model from sample A replicated in sample Bfixed. The other comparisons were significant (p < .001), thus not supporting validation of the classes. There was strong evidence that supported retaining the 3-class model. Implications of the results will be discussed, as well as the use of the C/V method in a wider range of mixture models. We will present detailed steps illustrating how the C/V approach can be used in other modeling contexts.
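The reported LR result can be checked with a short calculation. The sketch below rests on two assumptions that the abstract does not state: that the test statistic is twice the absolute difference between the two reported log-likelihoods, and that the 23 degrees of freedom correspond to the free parameters of a 3-class model with seven binary items (21 item probabilities plus 2 class proportions). The helper `chi2_sf` is a hypothetical name; it computes the chi-square upper-tail probability from the standard series for the regularized lower incomplete gamma function.

```python
import math


def chi2_sf(x, df, max_terms=10000):
    """Upper-tail chi-square probability, 1 - P(df/2, x/2).

    Uses P(a, z) = z^a e^(-z) * sum_{n>=0} z^n / Gamma(a + n + 1).
    """
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    total = 0.0
    for n in range(max_terms):
        term = math.exp((a + n) * math.log(z) - z - math.lgamma(a + n + 1))
        total += term
        if term < 1e-15 * total:
            break
    return 1.0 - total


n_classes, n_items = 3, 7
n_params = n_classes * n_items + (n_classes - 1)  # assumed source of df

ll_a = -2232.41        # sample A, as reported
ll_b_fixed = -2226.78  # sample Bfixed, as reported
lr_stat = 2 * abs(ll_a - ll_b_fixed)  # assumed form of the statistic
p_value = chi2_sf(lr_stat, n_params)
```

Under these assumptions the statistic is about 11.26 on 23 degrees of freedom, and the resulting p-value rounds to the .98 reported above.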

Authors