Individual Submission Summary

EGRA-AI: Automating Early Grade Reading Assessments (EGRA) in African Languages Using Voice-Recognition AI

Thu, March 14, 3:15 to 4:45pm, Hyatt Regency Miami, Floor: Third Level, Foster 2

Proposal

The rapid rise of artificial intelligence (AI) has led to a proliferation of use-cases in health, communication, logistics, and manufacturing, as well as in education. Although this new technology brings innovations and efficiencies in these domains, it has not been without critique. While some scholars have focused on the geopolitical and existential risks of an unregulated AI industry, many have identified the social and political consequences of enhanced surveillance networks and their ability to identify otherwise vulnerable groups. For example, the Chinese state uses facial recognition technology to identify and monitor Uyghur minorities (Harwell & Dou, 2020), British police have used facial recognition to track protesters (Dodd, 2023), and similar technologies can now detect sexual orientation with some accuracy (Kosinski & Wang, 2017). These examples show how these new technologies can be deployed for nefarious purposes, and they also foreground the issues identified in the CIES 2024 call for proposals around protest. Another stream of research identifies the systemic biases that become encoded into these algorithms when they are trained using biased data, or data that systematically over-represents one group.
In the field of comparative and international education these biases are also evident. For example, AI models that require transcribed training data generally use what is already available on the internet. In 2021, more than 50% of the content on the internet was in English, compared to 0.7% in Arabic and 0.1% in Hindi (WTS, 2021), despite the fact that more than a billion people globally speak either Arabic or Hindi. Models trained almost exclusively on English data (with Western, Northern values and cultural norms embedded in them) may prove to be biased when used in non-English settings.
In this paper we aim to report on a multidisciplinary research project undertaken in South Africa in 2023 which begins to address one of the above challenges. We use a new open-source unsupervised voice-recognition model (wav2vec) to validate whether the model can accurately convert speech to text at the phoneme level in a small African language (Northern Sotho, also known as Sepedi). To do so, we collected data on more than 400 Grade 2 and 3 children from 20 no-fee schools in South Africa in 2023. We are able to compare the outcomes of two approaches. First, we collect standard reading data from the manually administered Early Grade Reading Assessment (EGRA), i.e. testing letter-sound knowledge and word reading skills. We then test the same children, in the same language, at the same time, using the alternative self-administered, AI-powered voice-recognition model (EGRA-AI). Finally, we compare the rank order and skill levels of children across the two tests to determine whether the self-administered EGRA-AI can accurately assess early reading skills (letter sounds and word reading) relative to the current standard assessment (EGRA).
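As a concrete illustration of this comparison, the sketch below shows one way the two steps could be wired together in Python: transcribing a recorded response at the phoneme level with a wav2vec 2.0 model, and checking rank-order agreement between manual EGRA scores and AI-derived scores using a Spearman correlation. The Hugging Face checkpoint name, the decoding choices, and the scores shown are illustrative assumptions, not the project's actual pipeline.

```python
# Illustrative sketch only: the checkpoint, scoring, and numbers below are
# assumptions for exposition, not the EGRA-AI project's implementation.
import torch
import librosa
from scipy.stats import spearmanr
from transformers import AutoProcessor, AutoModelForCTC

# Hypothetical phoneme-level wav2vec 2.0 checkpoint; a Sepedi-specific model
# would be substituted here.
MODEL_ID = "facebook/wav2vec2-xlsr-53-espeak-cv-ft"
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForCTC.from_pretrained(MODEL_ID)

def transcribe_phonemes(wav_path: str) -> str:
    """Convert a recorded response to a phoneme string via CTC decoding."""
    speech, _ = librosa.load(wav_path, sr=16_000)  # wav2vec 2.0 expects 16 kHz audio
    inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]

# Rank-order agreement between the manual EGRA and the AI-scored EGRA-AI:
# a high Spearman correlation would indicate that the two assessments order
# children's reading skill similarly (scores below are made up).
egra_manual = [12, 30, 7, 45, 22]
egra_ai = [10, 28, 9, 41, 25]
rho, p_value = spearmanr(egra_manual, egra_ai)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```

A full analysis of the kind described above would also compare absolute skill levels, not rank order alone.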
Our results have important implications for future research on reading in African languages and for the use of AI in the Global South, and speak to whether new models that do not require vast sets of training data can produce similarly transformative use-cases.
