Objectives: AI tools are increasingly used to generate decodable texts. However, both decodable and digital texts have drawbacks for early readers—decodable texts often sacrifice meaningful content for phonetic control, while digital texts can overwhelm with distracting multimedia elements (Authors, 2025). As part of the Year 1 Exploratory Study for the Center for Early Literacy and Responsible AI (CELaRAI), we designed a decodable text rubric to operationalize the theoretical and empirical components of appropriate texts for beginning readers. This paper describes the rubric and evaluations of AI-generated decodable texts to answer the research question: How do commercially available AI-generated decodable texts align with research-based practices regarding decoding, comprehension, and cultural relevance?
Framework: Decodable texts (written using a limited number of grapheme-phoneme correspondences) are often recommended or mandated for K-2 students to promote orthographic mapping of novel words during reading (Ehri, 2014). However, multicriteria texts that combine decodable words with content or thematic focus, engagement, high-frequency words, word repetition, and vocabulary difficulty are more beneficial for beginning readers than texts designed on the basis of decodability alone (Cheatham & Allor, 2012; Hiebert & Fisher, 2007; Mesmer et al., 2015; Odo, 2024; Pugh et al., 2023). Thus, we use a multicriteria approach in our rubric design.
Methods: We used a design-based approach (Reinking & Bradley, 2007) to develop the Digital/AI Text Platform Analysis Rubric. We (1) identified relevant research to inform the initial criteria; (2) refined the operationalization of the criteria across three iterative rounds of testing, evaluation, and redesign based on scoring decodable texts from three commercial platforms (Ello, LitLab, and Project Read); and (3) asked a teacher to assess the rubric’s utility. The final rubric includes: (1) code analysis (ten criteria about word-recognition challenges), (2) meaning analysis (five criteria about vocabulary and comprehension challenges), (3) cultural relevance analysis (four criteria based on student-text match characteristics), and (4) platform analysis (five criteria about text selection and progression).
Data sources included nine scored rubrics (24 criteria each), yielding 216 data points. For analysis, we entered each rubric’s scores into a database and sorted the database to identify patterns across criteria.
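As a hedged illustration only (not the study’s actual tooling), the score-aggregation step could be approximated with a short script like the one below; the criterion names, score scale, and sample values are hypothetical placeholders, while the platform names come from the abstract.

```python
# Hypothetical sketch: aggregating rubric scores to surface patterns across criteria.
# Criterion names and the 0-2 score scale are illustrative assumptions, not the
# study's actual rubric values; platform names follow the abstract.
from collections import defaultdict
from statistics import mean

# Each record = one criterion score from one platform's scored rubric.
scores = [
    {"platform": "Ello",         "dimension": "code",    "criterion": "GPC scope",          "score": 2},
    {"platform": "LitLab",       "dimension": "code",    "criterion": "GPC scope",          "score": 1},
    {"platform": "Project Read", "dimension": "meaning", "criterion": "vocabulary support", "score": 0},
    {"platform": "Ello",         "dimension": "meaning", "criterion": "vocabulary support", "score": 1},
    # ... remaining rubric rows would be entered here ...
]

# Group scores by (dimension, criterion), then sort by mean score so the
# weakest-aligned criteria appear first.
by_criterion = defaultdict(list)
for row in scores:
    by_criterion[(row["dimension"], row["criterion"])].append(row["score"])

for (dimension, criterion), vals in sorted(by_criterion.items(), key=lambda kv: mean(kv[1])):
    print(f"{dimension:>8} | {criterion:<20} | mean score = {mean(vals):.2f}")
```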
Results: Across platforms, AI-generated decodable texts were better aligned with research on code-related features than on meaning-related features. However, code-related features were categorized very broadly (e.g., “long vowels”). Multisyllabic and multiple-meaning words were frequently included with no support. Attempts at cultural relevance were often superficial. Most programs offered only a handful of texts focused on any given text feature, limiting students’ opportunities to practice the same patterns multiple times with novel texts. These findings inform our design of the AI Reading Enhancer (AIRE) text generation tool, the goal of the CELaRAI Year 2 Design Study.
Significance: AI has the potential to improve text generation and personalization, but AI-generated texts, like human-authored texts, are often unbalanced in the supports they provide for beginning readers. The Digital/AI Text Platform Analysis Rubric moves researchers and practitioners toward a shared understanding of the multidimensional criteria that should be considered in evaluating and designing these texts.