Individual Submission Summary

AI-driven English Learning in Vernacular Contexts: Lessons from a Chatterbot Pilot in India

Sat, March 22, 1:15 to 2:30pm, Palmer House, Floor: 3rd Floor, The Madison Room

Proposal

The integration of Artificial Intelligence (AI) into education has the potential to revolutionize learning, and its most frequently cited advantage is personalized learning: tailoring educational experiences to meet individual student needs. However, the transformative potential of AI in education lies not simply in personalization but in how effectively AI-driven solutions are integrated into diverse learning contexts, particularly for learners in underserved communities.

The English Language Program (ELP) is a comprehensive, human-AI hybrid initiative aimed at improving English skills among learners in rural and semi-urban vernacular contexts, especially in the absence of a highly trained cadre of English language instructors. In the last two years, the program has reached 50,000 children (aged 11-14) through over 4,800 youth volunteers (aged 14 and above) across 14 states in India. The program's three key features (scalability, engagement, and credibility) are underpinned by empowering youth volunteers (including parents) with technology tools, using an automated digital assessment system to identify learning levels and track learning progression, and leveraging WhatsApp for network building and for disseminating level-appropriate content. Together, these features support widespread replication and scaling of the solution.

Within this broader ELP framework, the Chatterbot, a conversational AI tool, addresses a significant gap in speech practice and opportunities for practical language use, particularly in rural and semi-urban contexts. It thereby supports both the program's goal of effective English language acquisition and the ELP's broader mission of addressing the educational needs of underserved communities.

To ensure that the Chatterbot aligns with effective pedagogical principles, a high-level pedagogy rubric, informed by literature reviews and participatory sessions with learning designers, was developed. From a pedagogical perspective, the chatbot's language was designed to be clear and accessible to non-native English learners, with limits set to avoid overly complex vocabulary (words of roughly 6-7 letters at most) and to keep responses to clarification requests under about 50 words.
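As a rough illustration of how such limits could be enforced in practice, the sketch below checks a generated reply against the two constraints just described. It is a minimal Python sketch, not the program's actual implementation; the function name is hypothetical and the thresholds simply mirror the figures above.

    def violates_rubric(reply, is_clarification, max_word_len=7, max_words=50):
        """Return a list of rubric violations found in a chatbot reply.

        Hypothetical checker; thresholds mirror the rubric's limits on
        word length (~6-7 letters) and clarification length (~50 words).
        """
        words = [w.strip(".,!?;:") for w in reply.split()]
        issues = []
        long_words = [w for w in words if len(w) > max_word_len]
        if long_words:
            issues.append("complex vocabulary: " + ", ".join(long_words))
        if is_clarification and len(words) > max_words:
            issues.append(f"clarification reply has {len(words)} words (limit {max_words})")
        return issues

    # A reply using a long word would be flagged for simplification:
    print(violates_rubric("Photosynthesis means making food from light.", False))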

The pilot was conducted across three distinct linguistic contexts (three states in India): Jharkhand, Maharashtra, and West Bengal. Each state represents a unique regional language environment, allowing us to test the Chatterbot's adaptability and effectiveness in diverse linguistic settings. To assess and refine the Chatterbot's application effectively, we ran the pilot in communities with existing vocational skilling programs. This strategic choice allowed us to contextualize the Chatterbot within real-life, industry-relevant scenarios (for instance, healthcare and hospitality courses) for English practice, while also tapping into learners' intrinsic motivation to engage with the product. Over the course of the pilot, 42 students engaged with the chatbot, initiating 1,384 conversations. This pilot phase was crucial for testing the Chatterbot's effectiveness in real-world contexts and ensuring it met the needs of both tutors and learners.

To study user and bot behavior thoroughly, a mixed-methods approach was used to capture and analyze the user interactions and engagement patterns recorded in the Chatterbot's backend. An evaluation taxonomy was prepared to address practical considerations in pedagogical evaluations, resulting in diverse benchmarks that offer a comprehensive view of the Chatterbot's pedagogical effectiveness. Metrics included response relevance, recognition accuracy, effectiveness of initial/base prompts, adaptability, safety, and re-engagement. Edu-ConvoKit[1] was employed to measure conversational metrics for both the chatbot and the users; it facilitated breaking down the dialogue between user and chatbot by talk time, which provided insights into the flow and engagement of the conversations.
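The talk-time measure itself can be illustrated with a minimal sketch: each speaker's share of the total words in a transcript. This is only an approximation of the kind of breakdown Edu-ConvoKit provides, written here with hypothetical transcript fields rather than the library's actual API.

    from collections import Counter

    def talktime_shares(transcript):
        """Each speaker's share of total words across a transcript.

        `transcript` is a list of turns with hypothetical "speaker" and
        "text" fields; a rough stand-in for the talk-time breakdown the
        pilot obtained from Edu-ConvoKit.
        """
        word_counts = Counter()
        for turn in transcript:
            word_counts[turn["speaker"]] += len(turn["text"].split())
        total = sum(word_counts.values()) or 1
        return {speaker: n / total for speaker, n in word_counts.items()}

    demo = [
        {"speaker": "bot", "text": "Hello! What did you do today?"},
        {"speaker": "learner", "text": "I went to the market with my mother."},
    ]
    print(talktime_shares(demo))  # {'bot': 0.428..., 'learner': 0.571...}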

Additionally, the McAlpine EFLAW[2] readability score was used to assess the readability and linguistic complexity of the chatbot's responses. The mean readability score of 14 indicated that the bot's responses were generally easy to understand, though higher scores were occasionally observed where the bot was prompted to provide detailed explanations or instructions. The pilot also addressed privacy and safety concerns, particularly around the handling of digital personal data; this prompted a review of the chatbot's dialogue structures to ensure better compliance with privacy protocols and to safeguard user information in future implementations. These approaches, coupled with the use of LLMs for evaluation, were essential in refining the Chatterbot's design.
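For context on the score of 14: the McAlpine EFLAW score is computed as (words + mini-words) / sentences, where mini-words have three letters or fewer, and scores of 20 or below are considered very easy for EFL readers, so the pilot mean sits comfortably in that band. A minimal Python implementation for illustration:

    import re

    def eflaw(text):
        """McAlpine EFLAW: (words + mini-words) / sentences.

        Mini-words are words of three letters or fewer; lower scores
        mean easier text for EFL readers (<= 20 is "very easy").
        """
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        words = re.findall(r"[A-Za-z']+", text)
        mini_words = [w for w in words if len(w) <= 3]
        return (len(words) + len(mini_words)) / max(len(sentences), 1)

    print(eflaw("The cat sat on the mat. It was happy."))  # 8.5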

In its current iteration, the Chatterbot can be used as part of a program or independently. Based on learnings from the pilot, the Chatterbot is set to serve a wide range of users, including tutors, parents, and independent learners, with tailored content both for lower-level learners (with limited vocabulary and comprehension skills) and for advanced learners and youth who have completed earlier stages, for whom it emphasizes free-flowing conversations and complex comprehension. Future developments may include integrating reading practice and enhancing adaptability to better meet diverse learning needs.


References:
1. Wang, R. E., & Demszky, D. (2024). Edu-ConvoKit: An Open-Source Library for Education Conversation Data. arXiv preprint arXiv:2402.05111. https://arxiv.org/pdf/2402.05111
2. Nirmaldasan. (2015). McAlpine EFLAW Readability Score. Readability Monitor. https://strainindex.wordpress.com/2009/04/30/mcalpine-eflaw-readability-score/
