Session Submission Summary

Leveraging Voice AI (Artificial Intelligence) to assess early grade reading skills and support learning

Mon, March 24, 9:45 to 11:00am, Palmer House, Floor: 3rd Floor, Salon 6

Group Submission Type: Formal Panel Session

Proposal

The current learning crisis is deep, widespread and growing, with an estimated 70% of 10-year-olds unable to understand a simple written text, according to a report published in October 2022 by the World Bank, UNESCO, UNICEF, the UK Foreign, Commonwealth & Development Office (FCDO), USAID, and the Bill & Melinda Gates Foundation.

Improving learning at scale is hard, and the difficulty is compounded by a significant lack of data on children’s reading ability. The gap is particularly pronounced in low- and middle-income countries (LMICs) and in the early grades, where regularly collected data is scarce. Such data is needed to prioritize foundational learning for all, both to support interventions in the classroom and to inform policy decisions at the system level.

Collecting data on children’s early reading abilities at scale is logistically demanding and costly. One of the primary reasons is that early reading abilities are best assessed orally, by listening to how children decode and understand text as they read aloud. This requires one-on-one administration, which is both time-consuming and resource-intensive.

The Early Grade Reading Assessment (EGRA) is the most widely known and used assessment of early reading globally. It has been implemented in multiple languages in over 70 countries, and major agencies and donors, including USAID, FCDO, GPE, the World Bank, and the Learning Metrics Task Force, are engaged in promoting EGRA-type one-on-one assessments to advance reading fluency skills.

A typical EGRA-type assessment measures foundational literacy skills. These skills, each aligned with key developmental milestones in early childhood, include, but are not limited to: phonemic awareness (developed between approximately ages 4 and 7); vocabulary, story-telling, and oral language comprehension (developed between approximately ages 4 and 8); letter recognition (developed between approximately ages 5 and 7); and word reading, non-word reading, oral reading fluency, reading comprehension, and reading prosody (developed between approximately ages 7 and 10).

Voice AI (Artificial Intelligence) models, and particularly automatic speech recognition (ASR) models, have the potential to revolutionize education as AI technologies develop. If trained well, they can reduce the cost and logistical effort involved in assessing early reading skills. With features like voice recording and automatic marking, results can also be fed back into the system and provided directly to teachers and students, making the assessment process more efficient and impactful. Going a step further, when combined with text-to-speech (TTS) models and natural language processing, they can help young learners improve through features like real-time feedback as a child reads aloud.

This panel will begin by exploring the development and application of Voice AI systems to support the assessment of early reading skills, focusing on the challenges of using off-the-shelf AI technologies, which are often trained on adult voice data and resource-rich languages, such as English.

The first presentation introduces a critical technical challenge: the inadequacy of current Voice AI models in accurately assessing early literacy in children, especially in low-resource languages. This presentation delves into why existing systems, which are primarily trained on adult voices and resource-rich languages, struggle to meet the unique needs of early readers. When children are learning to read, for example, they often take a few seconds to sound out a word, making various noises or partial sounds as they piece together the letters and syllables before arriving at the correct pronunciation. The challenge lies in ensuring that Automatic Speech Recognition (ASR) systems are sophisticated enough to differentiate between these intermediate sounds and the final correct pronunciation. ASR must filter out background noise and partial utterances while still accurately recognizing words and letter sounds, a complexity that current systems often fail to address.
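As a rough illustration of the scoring problem described above, the sketch below aligns a hypothetical ASR token stream against the reference passage and counts the reference words that appear in order, so that sounding-out fragments such as "c" and "ca" are ignored rather than marked as errors. The function name, the token format, and the sample tokens are assumptions made for illustration and are not taken from any of the tools discussed in this panel. The approach is also deliberately naive: a fragment that happens to be a real word in the passage could still be credited.

```python
from difflib import SequenceMatcher


def count_correct_words(reference, asr_tokens):
    """Count reference words that appear, in order, in the ASR token stream.

    Tokens that do not align with the reference (for example partial
    sounds produced while decoding a word) are simply skipped.
    """
    ref = [w.lower() for w in reference]
    hyp = [t.lower() for t in asr_tokens]
    # autojunk=False keeps frequent words such as "the" from being ignored.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


# Hypothetical example: the child sounds out "cat" and "mat" before reading them.
reference = ["the", "cat", "sat", "on", "the", "mat"]
asr_tokens = ["the", "c", "ca", "cat", "sat", "on", "the", "m", "mat"]
print(count_correct_words(reference, asr_tokens))  # 6
```

A substitution is not credited: if the child reads "sit" instead of "sat", the count for the same passage drops to 5.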

The second presentation provides a global overview of efforts to date to fine-tune Voice AI models for children’s voices across multiple languages in the Global South. Building on the technical discussion introduced by the first speaker, it focuses on the investments needed to create scalable solutions such as EGRA-type assessments. It will also address consent, data security, and the storage and management of open-access databases of children’s voices, all of which will need to align with national standards.

The third presentation turns to the application of an AI-powered Oral Reading Fluency (ORF) tool in India, designed to identify students in need of remediation support. The tool addresses key challenges in India, where teachers face difficulties in conducting frequent and uniform assessments, and where the absence of a common repository makes it difficult to track the progress of an individual student. By automating reading fluency assessments, the ORF tool offers scalable, reliable, and accurate evaluation, particularly in contexts with high pupil-teacher ratios. The tool has already been piloted across several states in India, and this presentation will share insights on its deployment and its integration into dashboards that aggregate learning data at various administrative levels, from school to block, district, and state-level education departments.
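To make the idea of automated fluency scoring and dashboard aggregation concrete, here is a minimal sketch, not a description of the ORF tool itself. It computes words correct per minute, a common way of expressing oral reading fluency, averages it by an administrative level, and flags students below a cut-off. The record fields, the sample records, and the threshold of 25 are all invented for illustration; a real benchmark would depend on grade and language.

```python
from collections import defaultdict
from statistics import mean


def words_correct_per_minute(correct_words, seconds):
    """Convert a count of correctly read words into a per-minute rate."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return correct_words * 60.0 / seconds


def mean_wcpm_by(records, level):
    """Average words correct per minute, grouped by a field such as 'school'."""
    groups = defaultdict(list)
    for r in records:
        groups[r[level]].append(
            words_correct_per_minute(r["correct"], r["seconds"])
        )
    return {key: mean(values) for key, values in groups.items()}


def flag_for_support(records, threshold):
    """Return ids of students whose rate falls below a (hypothetical) cut-off."""
    return [
        r["student"]
        for r in records
        if words_correct_per_minute(r["correct"], r["seconds"]) < threshold
    ]


# Invented sample records, for illustration only.
records = [
    {"student": "s1", "school": "A", "district": "D1", "correct": 30, "seconds": 60},
    {"student": "s2", "school": "A", "district": "D1", "correct": 20, "seconds": 60},
    {"student": "s3", "school": "B", "district": "D1", "correct": 45, "seconds": 90},
]

print(mean_wcpm_by(records, "school"))
print(flag_for_support(records, threshold=25))
```

The same grouping function can be called with "district" or any other field present in the records, which mirrors how a dashboard might roll results up from one administrative level to the next.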

The fourth presentation builds on the theme of applying AI to support reading fluency but shifts focus to a more interactive, student-centric approach. The presentation will showcase a read-aloud app, which provides real-time feedback to students as they read decodable books. This presentation complements the earlier discussion on assessment by demonstrating how AI can also support active learning and improvement. The focus on scalability across different languages and regions ties back to the global discussion presented by speaker 2, offering a broader vision for expanding these solutions to underserved populations.
