The goal of the present project was to implement an intervention that engages teachers in analyzing student data as a formative assessment (FA) process. Specifically, the intervention provided teachers with professional development in examining student errors and adjusting instruction to match their students’ needs; these practices are supported by research (Black & Wiliam, 2009; Chatterji, Koh, Choi, & Iyengar, 2009; Chudowsky, Glaser, & Pellegrino, 2001). Three assessments evaluating student knowledge of 8th Grade Mathematics Common Core State Standards 8EE5, 8EE6, and 8EE7 (one pretest and two posttests) were developed to collect formal evidence for the success of the intervention. Our sample included 857 8th grade Californian students in the control group and 1,252 students in the treatment group. Our analyses found that the intervention improved the posttest scores for the 8EE5 and 8EE6 unit (hereafter referred to as Post56), but not the posttest scores for the 8EE7 unit (hereafter referred to as Post7), even after controlling for pretest scores and demographic variables. While further research is necessary to see whether this intervention effect is replicable, we also want to evaluate whether the instruments used in our study are appropriate measures.
Using Item Response Theory (IRT), Rasch models were developed and evaluated in BIGSTEPS and R to refine these assessments, with special focus on whether the instruments were unidimensional and whether they contained poor-fitting items. In BIGSTEPS, items with infit or outfit greater than 1.5 were eliminated and the models were rerun. Principal components analyses were conducted for each model and helped identify which items to remove. While removing poor-fitting items and persons improved the fit of the pretest and Post56 IRT models, this did not work as well for the Post7 IRT model. It is possible that a well-fitting model for Post7 was not found because the instrument had only 12 items. Perhaps with more items that fit well and cover a larger range of abilities, this assessment could be refined into a better measure of student mastery of the 8EE7 math standard.
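The item-screening rule described above can be illustrated with a minimal sketch. It assumes that "infit" and "outfit" refer to the usual mean-square fit statistics of the dichotomous Rasch model and that person abilities and the item difficulty have already been estimated; the function names (rasch_prob, item_fit, flag_misfit) are hypothetical and are not taken from BIGSTEPS or from the study's R code.

```python
import math


def rasch_prob(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit, outfit) mean squares for a single item.

    responses: 0/1 scores, one per person
    thetas:    person ability estimates in logits, same order as responses
    b:         item difficulty estimate in logits
    """
    sq_resid_sum = 0.0  # sum of squared raw residuals
    var_sum = 0.0       # sum of model variances p * (1 - p)
    z2_sum = 0.0        # sum of squared standardized residuals
    for x, theta in zip(responses, thetas):
        p = rasch_prob(theta, b)
        var = p * (1.0 - p)
        resid = x - p
        sq_resid_sum += resid ** 2
        var_sum += var
        z2_sum += resid ** 2 / var
    infit = sq_resid_sum / var_sum    # information-weighted mean square
    outfit = z2_sum / len(responses)  # unweighted mean square
    return infit, outfit


def flag_misfit(infit, outfit, cutoff=1.5):
    """Apply the screening rule: flag an item if either statistic exceeds the cutoff."""
    return infit > cutoff or outfit > cutoff
```

Both statistics have an expected value near 1 when the data fit the model. Outfit is more sensitive to unexpected responses from persons far from the item's difficulty, which is one reason the two are usually checked together.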
Additionally, differential item functioning (DIF) between the treatment and control groups was evaluated for all three assessments using logistic regression. As expected, no DIF was found on the pretest, but DIF was found on the posttests, and it mostly favored the treatment group. Fourteen of the 22 items on Post56 showed evidence of uniform DIF favoring the treatment group, and 3 of the 12 items on Post7 showed evidence of uniform DIF, with one item favoring the control students and two items favoring the treatment students. However, after examining the effect sizes (R²) for each item, using <.035 as negligible DIF, .035-.070 as moderate DIF, and >.070 as large DIF, only two items on Post56 were identified as having moderate uniform DIF (Swaminathan & Rogers, 1990). The information obtained from IRT modeling and DIF modeling allows us to better refine the assessments used in our study, which we hope to disseminate to teachers interested in using our intervention in their classrooms.
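The effect-size screening described above can be sketched as a small classification step. In logistic regression DIF, uniform DIF is typically tested by comparing a model that predicts the item response from the total score alone with a model that also includes group membership; the sketch below assumes the reported effect size is the change in R² between those two models and that it has already been computed. The function name classify_dif and the example values are hypothetical, not results from the study.

```python
def classify_dif(delta_r2):
    """Label a DIF effect size using the cutoffs stated in the text.

    delta_r2: assumed to be the change in R-squared between the
              logistic regression models without and with the group term.
    """
    if delta_r2 < 0.035:
        return "negligible"
    if delta_r2 <= 0.070:
        return "moderate"
    return "large"


# Illustrative values only.
example_effect_sizes = {"item_a": 0.010, "item_b": 0.050, "item_c": 0.200}
labels = {item: classify_dif(value) for item, value in example_effect_sizes.items()}
```

Separating statistical significance from effect size in this way matters with samples this large, because even trivially small group differences on an item can reach significance.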