Objectives
This study investigates the cultural responsiveness of lesson plans produced by generative AI (GenAI) for STEAM education in Ghanaian contexts. It responds to the growing use of GenAI in lesson planning and to the need to evaluate the contextual relevance of GenAI outputs for classrooms in underrepresented regions such as Sub-Saharan Africa. The study addresses two research questions (RQs):
1. How do GenAI-generated STEAM lesson plans reflect Ghanaian cultural knowledge systems and pedagogical values?
2. How do human expert evaluations inform improvements in GenAI content for localized STEAM instruction?
Theoretical Framework
The study is situated within the Culturally Responsive Pedagogy (CRP) and Human-in-the-Loop (HITL) evaluation frameworks (Gay, 2015; Güvel et al., 2025). CRP highlights the role of cultural knowledge, experiences, and identities in effective lesson planning, and it informs the development of the evaluation rubrics. HITL evaluation positions human experts as critical co-designers and model evaluators in shaping AI-generated content (Hirosawa et al., 2024).
Methods
Participants
Four STEAM education experts were purposively selected based on: (1) at least five years of experience in their STEAM fields, (2) cultural competency in Ghanaian languages and educational practices, and (3) experience in teacher preparation and field evaluation.
Data Collection
Data collection involved two main steps. First, experts selected lesson topics from the Ghana Education Service curriculum and used the corresponding curriculum objectives to prompt a customized GPT tool, the Culturally Responsive Lesson Planner, to generate lesson plans. Both the AI-generated and curriculum-based plans were posted on a shared “Copy-Paste-and-Review Board.” Second, experts evaluated the AI outputs using a validated rubric that assessed six key areas on a five-point scale, and they provided comments alongside their ratings.
Analysis
To answer RQ 1, a descriptive statistical approach was employed, using frequency counts and means to analyze the numerical ratings provided by the experts. To address RQ 2, an inductive thematic analysis was conducted on the experts’ reflective comments.
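The descriptive step for RQ 1 can be illustrated with a minimal Python sketch. It assumes ratings are stored per rubric dimension as a list of five-point scores, one per expert. The dimension names and values below are hypothetical placeholders, not the study's data, and the `summarize` helper is introduced only for illustration.

```python
from collections import Counter
from statistics import fmean

# Hypothetical five-point ratings from four experts.
# Dimension names and scores are placeholders, NOT data from the study.
ratings = {
    "dimension_a": [5, 4, 5, 4],
    "dimension_b": [4, 4, 5, 3],
}


def summarize(scores):
    """Return (frequency of each scale point 1-5, mean rating rounded to 2 dp)."""
    counts = Counter(scores)
    frequencies = {point: counts.get(point, 0) for point in range(1, 6)}
    return frequencies, round(fmean(scores), 2)


# Frequency counts and mean for every rubric dimension.
summary = {name: summarize(scores) for name, scores in ratings.items()}

for name, (frequencies, average) in summary.items():
    print(f"{name}: frequencies={frequencies}, mean={average}")
```

Reporting the frequency counts next to each mean keeps the spread of ratings visible, which matters when the panel is small.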
Results
In answer to RQ 1, the rating analysis shows that, of the five dimensions evaluated, teacher agency to adjust, be creative, and collaborate received the highest mean score (M = 4.60). The next highest was the dimension on whether the GenAI-generated output reflected local content well and avoided most forms of bias (M = 4.53). In their reflections addressing RQ 2, the experts confirmed that the GenAI outputs were more culturally responsive than the lesson plans proposed in the official curriculum, and they acknowledged the practicality of these outputs for classroom implementation. Nonetheless, they commented that the outputs lacked attention to cross-cutting issues and global competencies and showed some limitations in fully incorporating local languages and diverse cultures.
Scholarly Significance
This study underscores the importance of localization, empowerment, and cultural responsiveness in evaluating GenAI-generated lesson plans. It is among the first empirical investigations to examine how human experts in the Global South assess the contextual alignment of GenAI outputs with national curricula. The findings point to areas for future research, especially global skills, cross-cutting themes, and the inclusion of subcultural languages in GenAI.