Paper Summary

Test Format and the Variation of Gender Achievement Gaps Within the United States

Tue, April 12, 2:15 to 3:45pm, Convention Center, Floor: Level Two, Room 209 B

Abstract

We examine whether state accountability tests with differing proportions of multiple-choice items produce different estimates of the size of gender achievement gaps. A small body of prior research indicates that, on average, males perform better than females on multiple-choice questions, relative to their performance on constructed-response questions (Lindberg et al., 2010; Beller & Gafni, 2000; Garner & Engelhard, 1999; DeMars, 1998; Ben-Shakhar & Sinai, 1991). Because each state’s accountability tests may include different proportions of multiple-choice and constructed-response items, boys and girls may be differentially advantaged or disadvantaged relative to one another as a result of the item structure of their state test.

We use student test score results in grades 4 and 8 in 2009 from three different tests: (1) state accountability tests (we have data from all 50 states and roughly 9,400 school districts); (2) the Measures of Academic Progress (MAP) assessment administered by the Northwest Evaluation Association (NWEA) (we have data from roughly 3,700 school districts); and (3) the NAEP tests (we have data from all 50 states). State accountability tests vary in item format across states; the NWEA and NAEP tests have a common item structure across states.

We leverage the variation in the item format across the tests to understand how performance gaps correlate with the proportion of multiple-choice, short constructed-response, and extended-response questions on the state accountability tests. We model the difference between the gender gaps measured on the state accountability tests and on an “audit” test (either NAEP or the NWEA tests, both of which have the same format in each state) as a function of the proportion of multiple-choice items on the state test.
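
A minimal sketch of this specification, with notation that is illustrative rather than taken from the paper (the abstract does not give the exact functional form or controls), is, for state s, grade g, and subject b:

\[
\hat{\delta}^{\text{state}}_{sgb} - \hat{\delta}^{\text{audit}}_{sgb} = \beta_0 + \beta_1\,\text{PropMC}_{sgb} + \varepsilon_{sgb},
\]

where \(\hat{\delta}\) denotes the estimated male-minus-female achievement gap (in standardized units) on the indicated test and \(\text{PropMC}_{sgb}\) is the proportion of multiple-choice items on the state accountability test. Under this sketch, \(\beta_1 > 0\) would indicate that gender gaps favor boys more as the multiple-choice share of the state test rises.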

We first note that states vary substantially in the proportion of multiple-choice items on their tests in math and ELA (ranging from 50% to 100% multiple choice). We find that boys perform better than girls of the same academic skill on multiple-choice items. Specifically, our estimates imply that gender gaps are larger (favoring boys more) on multiple-choice tests than on constructed-response tests. These results appear to be driven primarily by gender-by-item-format interactions in ELA: in ELA, gender gaps on multiple-choice tests are larger (favoring girls less and boys more) than on constructed-response tests. On math tests, the difference is smaller but in the same direction, again favoring girls less and boys more on multiple-choice tests than on constructed-response tests. These patterns hold regardless of whether we use NAEP or NWEA as the audit test.

This is the first analysis of its kind to explore the interplay between each state’s standardized test format and gender achievement gaps on those tests. Such an understanding is timely because scores on these tests may have consequences for schools and students. Moreover, this paper provides a better understanding of how the selection and use of different accountability assessments, with different test formats, may distort the interpretation of the size and direction of gender gaps and thereby affect comparisons across states.
