Individual Submission Summary

Scaling up remedial education in India: evidence from two RCTs of the same program at different scales

Wed, March 26, 2:45 to 4:00pm, Palmer House, 7th Floor, Clark 5

Proposal

For the past fifteen years, enrollment of school-age children in India has been consistently above 95% (ASER, 2023; UIS, 2024). Yet learning remains stubbornly low. According to the 2022 Annual Status of Education Report, the most recent year of the nationally representative survey, only 21% of children in grade 3 and 43% of children in grade 5 could read at the grade 2 level. In grade 5, 76% of children were unable to complete a subtraction problem from the grade 2 curriculum (ASER, 2023).

To address this crisis in learning, remedial education programs have been introduced across India. Experimental studies have found that these programs can substantially help lagging students catch up, but the magnitude of impact varies by program and context. At the high end of treatment effects, researchers found that training community volunteers to deliver two hours of after-school remedial instruction per day increased test scores by 0.75 standard deviations (SD) relative to a control group after 18 months (Lakshminarayana et al., 2013). Banerjee et al. (2007) found similar effects from a program that recruited young women from local communities to tutor students in basic literacy and numeracy during school hours; at the end of two years, tutored students' test scores had improved by 0.6 SD relative to control students.

Other remedial education programs have had more modest effects. A series of five experiments assessing ten different remedial education interventions in India found effects on test scores ranging from near-zero and statistically insignificant to 0.7 SD (Banerjee et al., 2016). The authors observed that exact replications of a previously successful model can generate similar impact, but that deviations from that model – implementing during school hours versus after school, running summer camps versus programs during the school year, implementation by volunteers versus paid contract teachers versus government teachers – can greatly influence the magnitude of the effect.

While the impacts of remedial interventions have been extensively studied in pilot programs, a key question is whether effective interventions can maintain their impact at scale. Programs often fail to make that jump. For example, Bold et al. (2018) studied the effects of giving grants to primary school parent-teacher associations to hire extra teachers in order to reduce class size. As a pilot, the program was effective at improving test scores, but when the government attempted to scale it up, it had null effects. The authors highlight difficulties in hiring enough new teachers and in monitoring the program, delays in paying new teachers' salaries, and political backlash from teacher unions as some of the reasons the program failed to have similar impact at scale. Banerjee et al. (2017) examined attempts to scale teaching-at-the-right-level (TaRL) programs in India and found that although some programs succeeded when scaled, others failed to remain impactful. The authors point to piloting bias, site-selection bias, and other factors as obstacles to replicating impact at scale.

This paper contributes to the literature on whether and how a program implemented successfully as a pilot can be impactful at scale. The program in this study is run by the non-profit Educate Girls in northern India and involves remedial instruction delivered by community volunteers in government schools. From 2015 to 2018 we conducted a randomized controlled trial (RCT) of this program when it was implemented in 160 schools for 12,000 children. From 2022 to 2024 we conducted an RCT of the same program as it was implemented in more than 5,000 schools for 260,000 children.

In the first RCT of the program, students in treatment schools gained 0.44 SD on tests of foundational literacy and numeracy relative to control students. The program had a modest effect on learning outcomes in the first two years, with the majority of impact coming in the third year. Heterogeneity analysis suggests that the implementer updated the program design in the last two years of the evaluation in response to results from the first year.

Following the first RCT, the program was scaled, and a second RCT was designed and implemented using the same measurement tools and methodological approach. After the first year of the program, students in treatment schools gained a modest 0.18 SD relative to students in control schools. After the second year of the program, students in treatment schools gained an impressive 1.25 SD relative to control students, with most of the gains coming in the second year. Treatment effects were large and positive across subjects (Hindi, math, English), grades (3, 4, 5), genders, and districts. Treatment effects were slightly smaller for students who were exposed to only one year of the program, but they were still very large, statistically significant, and positive across all subgroups.

We examine this unusual case where impact grew as the program scaled, and highlight lessons for implementers scaling their programs. First, Educate Girls and their funders had a strong focus on achieving learning outcomes, with the pilot program explicitly tying impact estimates to outcomes-based payments and the at-scale program setting specific learning targets with regular reporting to funders. Second, partly as a consequence of this outcomes-based focus, Educate Girls was encouraged to experiment with program design, and in both the pilot program and the at-scale program they made significant changes that likely led to greater learning gains. Third, as Educate Girls scaled, they invested heavily in performance management systems in which staff and volunteers conducted regular rapid assessments and tracked individual child progress, allowing the organization to frequently reallocate staff and resources where needed. Finally, contextual factors during the at-scale implementation may have led the at-scale program to have more impact than the pilot program, including complementarities with a new government foundational literacy and numeracy curriculum.

Author