Paper Summary

Innovations in the Analysis of Trend Data With Diverse International Test-Taker Populations

Mon, April 11, 10:00 to 11:30am, Marriott Marquis, Floor: Level Two, Marquis Salon 3

Abstract

International large-scale assessments (ILSAs) are administered cyclically to obtain indicators of education quality and effectiveness and to support program monitoring. Despite these goals, unintended consequences may ensue when analysts neglect the contextual factors underlying the data (e.g., differences in socioeconomic status, differing ratios of rural to urban populations, opportunity to learn the curriculum, or the technology used to collect data from test takers). They may also occur when sources of differential item performance across subgroups or participating countries go unanalyzed. Because some of these factors may be irrelevant to the assessed construct (e.g., linguistic distance from the test development language, item content, format, or type), their analysis is important for accurate interpretation and for subsequent data-based decisions. The complexity of these analyses grows with population heterogeneity as the test-taker population becomes more diverse. Such considerations matter not only within each assessment cycle but also across cycles: neglecting these factors may confound our ability to accurately judge the effectiveness of educational policies implemented across cycles, because differences due to construct-irrelevant variance are difficult to disentangle from actual differences in test takers' performance. This issue is heightened when some countries have more misfitting items than others.
The purpose of our presentation is, first, to demonstrate our analysis of the proportion of misfitting items across 30 countries participating in two administrations of the Programme for International Student Assessment (PISA): PISA 2006 and PISA 2009. Second, we describe a new algorithm that reduced item misfit across participating countries and can help address misfit in the analysis of trend data. We also examine sources of item misfit (e.g., item type, item format, and item content) to inform suggestions for improving test design and data analysis in the context of analyzing trend data with diverse populations. These analyses and the interpretations they support are important given the widespread use of international assessments to inform policy and educational interventions.
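To make the first analysis concrete: computing a per-country proportion of misfitting items amounts to flagging items whose fit statistic exceeds a cutoff and dividing by the number of items. The sketch below is illustrative only, not the authors' algorithm; the RMSD-style fit values, the 0.12 cutoff, and the country/item labels are invented assumptions for demonstration.

```python
# Illustrative sketch (not the presenters' method): flag items whose
# fit statistic (e.g., an RMSD-type index) exceeds a hypothetical
# cutoff, then report the proportion of misfitting items per country.

RMSD_CUTOFF = 0.12  # hypothetical flagging threshold, chosen for illustration

# toy per-country, per-item fit statistics (invented values)
fit_stats = {
    "AUS": {"M1": 0.05, "M2": 0.14, "M3": 0.08, "M4": 0.16},
    "JPN": {"M1": 0.03, "M2": 0.06, "M3": 0.09, "M4": 0.11},
}

def misfit_proportion(item_fit, cutoff=RMSD_CUTOFF):
    """Proportion of items whose fit statistic exceeds the cutoff."""
    flagged = [item for item, stat in item_fit.items() if stat > cutoff]
    return len(flagged) / len(item_fit)

for country, stats in fit_stats.items():
    print(country, round(misfit_proportion(stats), 2))
```

With the toy values above, half of the AUS items and none of the JPN items would be flagged; comparing such proportions across countries and cycles is the kind of summary the presentation describes.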

Authors