The objective of this study is to examine the reliability and predictive validity of the CLASS 2nd Edition tool in the early grades (Pre-K – 2nd grade). The success of using CLASS 2nd Edition in education systems, like any standardized observation measure, inherently depends on several assumptions. First, we want to be sure that the domain scores are accurate estimates of the dimensions (i.e., constructs) that they represent. Second, it is important that the scores obtained from observers are reliable estimates of the classroom. Lastly, it is necessary that the tool is validated by being predictive of other subsequent outcomes expected to result from the observed constructs.
This study used CLASS 2nd Edition observations collected during the 2022-2023 school year. The sample included 515 Pre-K, 258 kindergarten, 290 1st grade, and 264 2nd grade classrooms. For outcome measures, we used the CIRCLE pre-K assessment (cliengage.org) and the NWEA MAP assessment for K-2nd grade (nwea.org/the-map-suite/). CIRCLE measures include continuous scores in math, phonemic awareness, and vocabulary. MAP measures include reading and mathematics.
We conducted reliability analyses for internal consistency and inter-rater reliability. For internal consistency, we computed the conventional Cronbach’s Alpha coefficient for each domain and for the overall scores under the three-factor model. For inter-rater reliability, we computed rates of agreement using Kappa and estimated the intraclass correlation coefficient (ICC) using a random-effects modeling approach. We also collected qualitative data from our veteran CLASS observers to summarize their experience transitioning to the 2nd Edition.
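As an illustration of the three reliability statistics named above, the following is a minimal sketch, not the study's actual code. The function names and input layouts are hypothetical, and the ICC here uses the classical one-way ANOVA estimator as a simple stand-in for the random-effects model described in the abstract.

```python
from collections import Counter
from statistics import mean, variance


def cronbach_alpha(scores):
    """Internal consistency. `scores` is a list of rows (one per observation),
    each row holding the k item (e.g., dimension) scores. Hypothetical layout."""
    k = len(scores[0])
    # Sample variance of each item across observations.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters' categorical codes."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if the raters coded independently.
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


def icc_oneway(ratings):
    """One-way random-effects ICC (single rater) from ANOVA mean squares.
    `ratings` is a list of rows (one per classroom), each with k ratings.
    Assumption: a stand-in for the mixed-model estimate used in the study."""
    n, k = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

For example, `cohen_kappa([1, 1, 2, 2], [1, 2, 1, 2])` gives 0, because the observed agreement of 0.5 equals the agreement expected by chance.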
For validity analyses, CORE first computed correlations between student outcomes and domain-level CLASS scores. Then, in multilevel structural equation modeling (MSEM) analyses, students and teachers formed the lower and higher levels of the models, respectively. At the student level, student outcomes (per test type and grade level) were the dependent variables. We included student-level and teacher-level demographic variables as covariates (e.g., race, language, and teacher degrees). At the teacher level, CORE used the three-factor measurement model for CLASS scores, with three correlated dimensions of quality as latent factors predicting student-level outcome intercepts. The three path coefficients per outcome type were the main point of interest, as they capture the magnitude of the association between classroom-level quality and student outcomes.
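One way to write down a model of this shape is sketched below. The notation is an assumption introduced for illustration and does not come from the study: student i is nested in teacher j, x and w are the student-level and teacher-level covariates, c_j is the vector of observed CLASS scores, and η_1j, η_2j, η_3j are the three correlated latent quality factors.

```latex
% Student level: outcome with a teacher-specific intercept
y_{ij} = \beta_{0j} + \boldsymbol{\gamma}^{\top}\mathbf{x}_{ij} + \varepsilon_{ij}

% Teacher level: latent quality factors predict the intercept
\beta_{0j} = \alpha + \lambda_{1}\eta_{1j} + \lambda_{2}\eta_{2j} + \lambda_{3}\eta_{3j}
           + \boldsymbol{\delta}^{\top}\mathbf{w}_{j} + u_{j}

% Measurement model: observed CLASS scores load on the three factors
\mathbf{c}_{j} = \boldsymbol{\nu} + \boldsymbol{\Lambda}\boldsymbol{\eta}_{j} + \boldsymbol{\theta}_{j},
\qquad \boldsymbol{\eta}_{j} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Psi})
```

Under this notation, λ_1, λ_2, and λ_3 correspond to the three path coefficients per outcome type, and the off-diagonal elements of Ψ allow the quality factors to correlate.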
All reliability estimates indicated good reliability for each factor as well as for the tool as a whole. Inter-rater reliability was good, with evidence of variation by grade level. Our strongest evidence came from feedback from observers, who reported their perspectives on the tool's improvements. CLASS 2nd Edition was significantly associated with student end-of-year and growth outcomes across multiple grade levels and content areas.
Findings from this study support the use of CLASS 2nd Edition as a reliable and valid measure of classroom quality in the early grades. This evidence supports educators seeking to implement practice-based quality measures that will provide actionable data that is accurate, reliable, and likely to result in improvements in the classroom that are associated with stronger outcomes for young learners.