Paper Summary
A Synthesis of Validity Evidence Regarding Quantitative Mathematics-focused Teacher Education Instruments

Sun, April 27, 11:40am to 1:10pm MDT, The Colorado Convention Center, Floor: Meeting Room Level, Room 712

Abstract

The purpose of this study was to conduct a systematic qualitative review of quantitative instruments used in mathematics education research from 2000 to 2020. This team of teacher education instrument (TEI) scholars examined measures of teachers’ affective characteristics, behaviors, and other constructs, while a separate team studied knowledge measures. This proposal builds on past research from this project disseminated by the TEI team (Authors, 2024, 2023, 2022).

The review process was guided by Thunder and Berry’s (2016) guidelines for qualitative systematic reviews. Six TEI researchers worked in pairs: (a) searching Google Scholar for studies that potentially used or reported on each instrument, (b) selecting studies that might contain validity evidence and/or reliability information on the instrument, and (c) documenting that validity and reliability evidence. For (a), TEI pairs reviewed approximately 2,300 articles from 24 mathematics education journals. This extensive review yielded 255 unique instruments, which became our sample for analysis. For (b) and (c), the sample space for validity evidence included peer-reviewed journal articles, conference proceedings, dissertations, and white papers. Pairs of coders met periodically to discuss validity and reliability evidence and to reconcile any differences. Descriptive statistics were computed to determine the frequency of reported sources of validity evidence. Those sources are drawn from the Standards (AERA et al., 2014): test content, response processes, internal structure, relations to other variables, and consequences of testing. Data were further disaggregated by instrument to assess the presence of interpretation and/or use statements, claims, and the sources of validity evidence represented.

Findings indicated that the TEI team identified 480 instances of validity evidence across 158 of the 255 instruments reviewed. Table 1 displays the number of instruments and instances of validity evidence for each validity source. Additionally, 225 instances of reliability evidence were identified across 142 of the 255 instruments (55% of the sample). We located instrument score interpretation statements for 23 instruments (9% of the sample) and score use statements for 49 instruments (19% of the sample). More than one third of the instruments had no validity evidence associated with them (97 instruments, 38%). On the other hand, 35 instruments (14% of the sample) had validity evidence related to three or more validity sources.

Taken collectively, these findings suggest that validity evidence could not be located for many instruments, which is incongruous with current guidelines for assessment development (AERA et al., 2014). Descriptions of instrument use and interpretation were commonly missing. One implication of this study is a call for quantitative instrument developers to align their practices with current guidelines. A second implication is the opportunity for scholars to conduct validation studies and gather evidence, which may or may not support claims that an instrument measures what it intends to measure. Ultimately, the study highlights the need for improved validation practices and provides examples and recommendations for more robust and meaningful evaluations of validity in the field of mathematics education.

Table 1. (Cannot paste a table in AERA system)

Authors