Large-scale assessments (LSAs) use calibrated items to evaluate student performance across a range of abilities. A crucial step in this process is estimating the item parameters, since their true values are unknown. Item Response Theory (IRT) models are frequently used for this purpose, under the assumption that students' latent ability is normally distributed. If the distribution is in fact skewed, the parameter estimates can be biased (Finch & Edwards, 2016; Reise et al., 2018; Suh, 2015). The validity of the normality assumption is critical because a total population may be normally distributed while subpopulations sampled non-randomly from it are not (Sass et al., 2008).
In multistage adaptive testing (MST), students are assigned non-randomly to test modules based on their performance. For instance, in a 1-2-3 MST design, all students start with the same items, and each subsequent stage routes them to an easier or harder module based on their earlier responses, tailoring the test to their abilities (Yan et al., 2014).
This adaptive routing means that the ability distribution within each module differs from that of the total population: students in easier modules have left-skewed distributions, students in harder modules have right-skewed distributions, and students in average-difficulty modules have leptokurtic distributions. In addition, the variance in ability within each module is typically smaller than in the overall sample.
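The routing effect described above can be illustrated with a minimal sketch. It is not the simulation used in this study: it assumes a single routing stage of seven hypothetical 2PL items, a deterministic number-correct rule with assumed cut scores (0 to 2 easy, 3 to 4 medium, 5 to 7 hard), and standard normal abilities. All function names and values are illustrative.

```python
import math
import random
import statistics


def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def skewness(xs):
    """Population skewness (third standardized moment)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def summarize(xs):
    return {"n": len(xs), "var": statistics.pvariance(xs), "skew": skewness(xs)}


def simulate_routing(n_students=20000, a=1.4, seed=1):
    """Route simulated students to one of three modules and summarize
    the true-ability distribution inside each module."""
    rng = random.Random(seed)
    # Hypothetical routing-stage difficulties (assumption, not from the study).
    difficulties = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
    modules = {"easy": [], "medium": [], "hard": []}
    all_thetas = []
    for _ in range(n_students):
        theta = rng.gauss(0.0, 1.0)
        all_thetas.append(theta)
        # Number-correct score on the routing items.
        score = sum(rng.random() < p_correct(theta, a, b) for b in difficulties)
        # Deterministic allocation rule with assumed cut scores.
        if score <= 2:
            modules["easy"].append(theta)
        elif score <= 4:
            modules["medium"].append(theta)
        else:
            modules["hard"].append(theta)
    result = {name: summarize(xs) for name, xs in modules.items()}
    result["all"] = summarize(all_thetas)
    return result


if __name__ == "__main__":
    for name, stats in simulate_routing().items():
        print(f"{name:>6}: n={stats['n']:5d} "
              f"var={stats['var']:.3f} skew={stats['skew']:+.3f}")
```

Under these assumptions the easy module should show negative skewness, the hard module positive skewness, and every module a smaller ability variance than the full sample.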
To explore how reduced ability variance and skewness affect item parameter recovery, we conducted a simulation study that varied test length (21, 42, and 84 items), sample size (1,000, 2,000, and 4,000), and average item discrimination (0.7 and 1.4), using both 1PL and 2PL IRT models. Students were allocated to modules with a deterministic rule, and the resulting degree of skewness or leptokurtosis was influenced by test length and item discrimination. Results from the simulations were then compared to findings from the 2024 central tests of Flanders, which use an MST design.
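As a rough illustration of the design, the factors listed above can be enumerated as a grid. The sketch assumes the factors are fully crossed, which the abstract does not state, and the names are hypothetical.

```python
from itertools import product

# Factor levels taken from the abstract; full crossing is an assumption.
TEST_LENGTHS = (21, 42, 84)
SAMPLE_SIZES = (1000, 2000, 4000)
MEAN_DISCRIMINATIONS = (0.7, 1.4)
MODELS = ("1PL", "2PL")


def build_conditions():
    """Return one dict per simulation condition."""
    return [
        {"n_items": n_items, "n_students": n_students,
         "mean_a": mean_a, "model": model}
        for n_items, n_students, mean_a, model in product(
            TEST_LENGTHS, SAMPLE_SIZES, MEAN_DISCRIMINATIONS, MODELS
        )
    ]


if __name__ == "__main__":
    conditions = build_conditions()
    print(len(conditions), "conditions; first:", conditions[0])
```

If the design is fully crossed, this yields 3 × 3 × 2 × 2 = 36 conditions.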