AERA Annual Meeting: Challenges with Identification and Estimation of Cognitive Diagnostic Models

Challenges with Identification and Estimation of Cognitive Diagnostic Models

In Event: Advancements and Challenges in Cognitive Diagnostic Models and Mixture Modeling

Thu, April 24, 5:25 to 6:55pm MDT, The Colorado Convention Center, Floor: Meeting Room Level, Room 302

Abstract

Purpose
Cognitive Diagnostic Models (CDMs) are restricted latent class models used to infer students' specific attributes or skills from test performance. Like other latent class models, CDMs frequently encounter problems with non-identifiability. Even when a model is identified, however, it may still yield parameter estimates that cannot be interpreted reliably. Models with invalid or unreliable parameters can significantly distort understanding of the phenomena under investigation and lead to incorrect conclusions. This study investigates whether parameter estimates remain consistent across variations in skill prevalence and in the guessing and slipping parameters.

Theoretical Framework
Cognitive Diagnostic Models (CDMs) are widely used in education research to identify the specific skill sets that students demonstrate on assessment tasks (de la Torre & Douglas, 2004; Li et al., 2016; Ravand & Robitzsch, 2019; Templin & Henson, 2006). These models link skills to items via a Q-matrix; in the DINA model, a student must possess every skill an item requires in order to be expected to succeed on it. Gu and Xu (2018, 2019) provided conditions on the Q-matrix under which the model is identified. Even when these conditions hold, CDMs may face empirical underidentification: the model is technically identified but difficult to fit to data.
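
The DINA rule described above can be illustrated with a minimal sketch. The function name, the three-attribute item, and the guessing and slipping values are assumptions chosen for illustration; they are not taken from the study.

```python
def dina_success_prob(alpha, q_row, guess, slip):
    """Probability of a correct response to one item under the DINA model.

    alpha : attribute profile of the student, e.g. (0, 1, 0)
    q_row : Q-matrix row of the item (1 = attribute required)
    guess : P(correct) when at least one required attribute is missing
    slip  : P(incorrect) when all required attributes are present
    """
    # The student "has the item" only if every required attribute is mastered.
    has_all = all(a >= q for a, q in zip(alpha, q_row))
    return 1.0 - slip if has_all else guess


# Hypothetical item that requires attributes 1 and 2 out of three.
q_row = (1, 1, 0)
p_master = dina_success_prob((1, 1, 0), q_row, guess=0.2, slip=0.1)     # 1 - slip
p_nonmaster = dina_success_prob((0, 1, 0), q_row, guess=0.2, slip=0.1)  # guess
```

Under this rule, a student missing any single required attribute has the same success probability as a student missing all of them, which is what separates DINA from compensatory CDMs.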

Methods/Data Sources
Building on Gu and Xu (2019), this study simulates real-world conditions to test whether parameter estimates remain consistent across variations in skill prevalence and in the guessing and slipping parameters. Gu and Xu used 6 items with a Q-matrix meeting the minimal identification conditions and demonstrated consistency in a small simulation with equal skill prevalences (p = 0.5). Simulation 1 replicates their simulation and confirms their results. Simulation 2 uses the same Q-matrix with unequal skill prevalences to examine the impact of unequal class probabilities, with 1000 repetitions and a sample size of N = 800.
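
A data-generating step of this kind might look like the sketch below. It is not the authors' code: the Q-matrix is a hypothetical 6-item, 3-attribute example (not the one from Gu and Xu, 2019), the guessing, slipping, and unequal prevalence values are invented, and attributes are assumed to be independent.

```python
import random

# Hypothetical 6-item, 3-attribute Q-matrix, for illustration only.
Q = [
    (1, 0, 0),
    (0, 1, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 0, 1),
    (0, 1, 1),
]


def simulate_dina(n, prevalence, guess, slip, q_matrix, rng):
    """Draw n binary response vectors from a DINA model.

    Attributes are drawn as independent Bernoulli variables with the
    given prevalences (an assumption of this sketch).
    """
    data = []
    for _ in range(n):
        alpha = [1 if rng.random() < p else 0 for p in prevalence]
        row = []
        for q_row, g, s in zip(q_matrix, guess, slip):
            has_all = all(a >= q for a, q in zip(alpha, q_row))
            p_correct = 1.0 - s if has_all else g
            row.append(1 if rng.random() < p_correct else 0)
        data.append(row)
    return data


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
# One replication per condition at N = 800; a full study would repeat this.
equal = simulate_dina(800, [0.5, 0.5, 0.5], [0.2] * 6, [0.2] * 6, Q, rng)
unequal = simulate_dina(800, [0.8, 0.3, 0.5], [0.2] * 6, [0.2] * 6, Q, rng)
```

Each replication would then be fit with a CDM estimation routine and the resulting estimates compared with the generating values; that fitting step is outside the scope of this sketch.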

Results
In Simulation 1, under the same conditions as Gu and Xu (2019), the CDMs fit well. Figure 1 shows pairwise scatter plots of the model parameters, with consistent estimates close to the generating values. With unequal skill prevalences, however, model fitting was less consistent. Figure 2 shows a scatter plot of the guessing parameter for Item 2 and the 010 attribute class, with estimates pulled towards two non-generating solutions. The density plot of the guessing parameter in Figure 3 is bimodal, and estimates rarely converge at the generating values. Although formally identified, the model is empirically underidentified and requires very large sample sizes for consistent estimation.

Significance

Even when the minimal conditions on the Q-matrix for identification of a CDM are met, the results of fitting the model can vary heavily depending on the specific parameter configuration. In our example, model fitting was consistent and reliable with equal skill prevalences. With unequal skill prevalences, and thus unequal class sizes, the estimates under repeated sampling appeared to follow a bimodal convergence pattern. Difficulties in fitting CDMs are common, and our work will help researchers interpret parameter estimates and better understand the conditions under which consistent estimation is and is not expected.
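
To see how unequal skill prevalences translate into unequal class sizes, the following sketch computes latent class probabilities from attribute prevalences. It assumes independent attributes, and the unequal prevalence values are hypothetical; neither detail is stated in the abstract.

```python
from itertools import product


def class_probabilities(prevalence):
    """Map each attribute profile (e.g. "010") to its probability,
    assuming attributes are independent (an assumption of this sketch)."""
    probs = {}
    for alpha in product((0, 1), repeat=len(prevalence)):
        p = 1.0
        for a, prev in zip(alpha, prevalence):
            p *= prev if a == 1 else 1.0 - prev
        probs["".join(map(str, alpha))] = p
    return probs


equal = class_probabilities([0.5, 0.5, 0.5])    # all eight classes equally likely
unequal = class_probabilities([0.8, 0.3, 0.5])  # hypothetical prevalences

# Expected number of students in the 010 class at N = 800.
expected_equal_010 = 800 * equal["010"]
expected_unequal_010 = 800 * unequal["010"]
```

With these illustrative values, the 010 class is expected to contain 100 of 800 students under equal prevalences but only about 24 under the unequal ones, so parameters tied to that class are estimated from far less information.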
