AERA Annual Meeting: Application of Machine Learning Algorithms to Detect Treatment Effect Heterogeneity for Three-Level Multisite Experiments

Paper Summary

Application of Machine Learning Algorithms to Detect Treatment Effect Heterogeneity for Three-Level Multisite Experiments

In Event: Design and Analysis of Multisite Experimental Studies to Investigate Treatment Effect Heterogeneity

Sat, April 13, 3:05 to 4:35pm, Pennsylvania Convention Center, Floor: Level 100, Room 116

Abstract

Background
Educational researchers commonly include a treatment-by-moderator interaction in regression analyses or multilevel models (MLMs) to estimate moderator effects (e.g., Dong et al., 2022). Recent developments in statistics and econometrics (e.g., Athey & Wager, 2019; Chernozhukov et al., 2020) propose using machine learning (ML) methods to explore heterogeneous treatment effects (HTEs) by estimating the conditional average treatment effect (CATE). These methods have some advantages over traditional interaction/moderation analysis. For example, traditional moderation analysis usually requires specifying the moderators in the design phase and thus may miss important sources of HTEs. In contrast, ML methods can select moderators from a potentially large number of covariates.
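
For readers less familiar with the notation, the two quantities contrasted above can be written out as follows. This is a standard textbook formulation, not taken from the paper; the symbols (potential outcomes Y(1) and Y(0), treatment indicator T, a single moderator M, covariates X, coefficients β) are assumptions made for illustration, and the multilevel error structure is deliberately omitted.

```latex
% CATE: the expected treatment effect for units with covariate values x
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]

% Traditional moderation analysis: one pre-specified moderator M
Y = \beta_0 + \beta_1 T + \beta_2 M + \beta_3 (T \times M) + \varepsilon

% Under that linear model, the CATE is restricted to a line in m
\tau(m) = \beta_1 + \beta_3 m
```

Seen this way, the interaction model is a special case in which the CATE is forced to be linear in one chosen moderator, whereas the ML methods estimate τ(x) without fixing its form in advance.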

As with MLMs, applied researchers who use ML methods to estimate the CATE still need to account for the nested data structure. However, most prior literature assumes that participants are independent (e.g., Jacob, 2021). Little guidance exists to help educational researchers apply ML methods appropriately to clustered data when evaluating HTEs.

Purpose and Significance
This study contributes to the literature on the design and analysis of multisite experimental studies by comparing the currently available ML methods and tools that account for the nested data structure when estimating the CATE, using data from a large-scale three-level multisite randomized trial and Monte Carlo simulations.

Methods and Data
Based on our review of all the currently available methods and packages, only two algorithms consider the nested data structure: the cluster-robust causal forest (Athey & Wager, 2019) and GenericML (Chernozhukov et al., 2020). The cluster-robust causal forest can be applied through the R package grf, and the GenericML algorithm through the R package GenericML. Both packages report cluster-robust SEs. In addition, the GenericML package can estimate sorted group average treatment effects (GATEs), which are obtained by dividing participants into five groups at the quintiles of the estimated CATE distribution, and can perform classification analysis to explore the relationships between covariates and the CATE.
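
The quintile grouping behind the sorted GATEs can be illustrated with a small Python sketch. It is not the GenericML procedure, which among other things relies on repeated sample splitting and reports cluster-robust SEs; it only shows the grouping idea on simulated data. The function name `sorted_gates`, the simulated variables, and the use of the true effect as a stand-in for an ML estimate are all hypothetical, and the sketch ignores clustering entirely.

```python
import random
from statistics import mean


def sorted_gates(cate_hat, treated, outcome, n_groups=5):
    """Sort units by estimated CATE, cut them into n_groups equal-sized
    groups, and return the treated-minus-control mean outcome per group.
    Assumes a randomized treatment and both arms present in every group."""
    order = sorted(range(len(cate_hat)), key=lambda i: cate_hat[i])
    n = len(order)
    gates = []
    for g in range(n_groups):
        members = order[g * n // n_groups:(g + 1) * n // n_groups]
        y_treated = [outcome[i] for i in members if treated[i] == 1]
        y_control = [outcome[i] for i in members if treated[i] == 0]
        gates.append(mean(y_treated) - mean(y_control))
    return gates


# Hypothetical simulated data: the true effect grows with the covariate x.
random.seed(0)
x = [random.random() for _ in range(2000)]
treated = [random.randint(0, 1) for _ in x]
outcome = [xi + ti * 2 * xi + random.gauss(0, 1) for xi, ti in zip(x, treated)]

# Stand-in for an ML estimate of the CATE (here simply the true effect).
cate_hat = [2 * xi for xi in x]

gates = sorted_gates(cate_hat, treated, outcome)
for group, gate in enumerate(gates, start=1):
    print(f"Group {group}: GATE = {gate:.2f}")
```

In this sketch, Group 1 collects the units with the smallest estimated effects and Group 5 those with the largest, which mirrors the Group 5 versus Group 1 contrast reported for the GATEs.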

We will apply the cluster-robust causal forest and GenericML algorithms, as well as other widely used ML methods (e.g., DR-, S-, T-, R-, and X-learners; Jacob, 2021) that do not specifically consider the nested data structure, to (1) data from a large-scale three-level multisite experimental study (Leite et al., 2023) and (2) data from Monte Carlo simulations. The multisite experimental study included 52 math teachers and 2,936 students from three school districts. It randomly assigned students of participating teachers to see video recommendations. Our analysis includes 516 predictors, of which 484 are dummy-coded indicators. Table 1 summarizes the simulation conditions.
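
As a concrete instance of one of the meta-learners named above, the following Python sketch implements a T-learner in its simplest form: fit one outcome model per treatment arm and take the difference of the two predictions as the CATE estimate. It is a minimal illustration under stated assumptions, not the study's analysis: a single covariate, ordinary least squares as the base learner (in practice an ML model would be used), simulated data, and hypothetical names (`fit_line`, `t_learner`). Like the learners listed above, it treats observations as independent and ignores the nested structure.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def t_learner(x, treated, y):
    """Fit separate outcome models for treated and control units and
    return a function that estimates the CATE at a covariate value."""
    x1 = [a for a, t in zip(x, treated) if t == 1]
    y1 = [b for b, t in zip(y, treated) if t == 1]
    x0 = [a for a, t in zip(x, treated) if t == 0]
    y0 = [b for b, t in zip(y, treated) if t == 0]
    a1, b1 = fit_line(x1, y1)
    a0, b0 = fit_line(x0, y0)
    return lambda v: (a1 + b1 * v) - (a0 + b0 * v)


# Hypothetical simulated data with true CATE(x) = 0.5 + 2x.
random.seed(1)
x = [random.random() for _ in range(2000)]
treated = [random.randint(0, 1) for _ in x]
y = [1 + xi + ti * (0.5 + 2 * xi) + random.gauss(0, 0.5)
     for xi, ti in zip(x, treated)]

cate = t_learner(x, treated, y)
print(f"Estimated CATE at x = 0.5: {cate(0.5):.2f}")
```

The other learners differ in how they combine the outcome and treatment models, but they share the assumption visible here: every row is handled as an independent observation.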

Preliminary Results
Table 2 summarizes the GATEs estimated with the GenericML package from the multisite experimental data, showing the difference between the group that benefitted the most (Group 5) and the group that benefitted the least (Group 1) from the intervention. Results from other algorithms and simulated datasets will be provided and compared. We will also offer recommendations to applied researchers on choosing appropriate methods and statistical packages among the alternative ML methods.
