Objectives/Theoretical Framework
Learning entails a host of strategies that students can employ to acquire and apply knowledge across contexts. Although prior research has shown that some strategies produce more robust knowledge than others (see Dunlosky et al., 2013, for a review), students are rarely explicitly taught how to study and learn (Dignath & Büttner, 2018), and laboratory studies have shown that students underutilize effective strategies (e.g., Ariel & Karpicke, 2018). There is therefore a need to understand and improve students' study strategies, but doing so in an authentic, scalable, and efficient manner is a challenge. Some work has sought to capture students' awareness of their strategy use (i.e., metacognitive study strategies) with measures that prompt students about particular strategies at a general level (Bartoszewski & Gurung, 2015; Hartwig & Dunlosky, 2012; Karpicke, 2009; Kornell & Bjork, 2007), making it difficult to understand how students apply these strategies when studying in a particular context. To address this, recent work captured students' study strategies using open-ended questions and human raters (e.g., Authors, date), but this approach is time-consuming and impractical for real-time use. Here we leverage natural language processing (NLP) to speed up and scale the coding process, enabling automatic assessment of the strategies students use and rapid personalization of interventions based on students' experiences.
Method
Undergraduates from five psychology courses across two universities participated in the study as part of their regular classroom activities by responding to the prompt, "In a paragraph or two, please describe your study techniques." Two coders independently coded each response, marking each strategy as absent (0) or present (1), and then met to resolve disagreements (kappa > .70; percent agreement > 90%). See Table 1 for the coding protocol.
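For readers unfamiliar with these agreement statistics, the sketch below shows how per-strategy kappa and percent agreement can be computed; the two rater vectors are hypothetical stand-ins for the actual codes, not the study's data.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical 0/1 codes from each rater for one strategy, one entry per response.
coder_a = np.array([1, 0, 1, 1, 0, 0, 1, 0])
coder_b = np.array([1, 0, 1, 0, 0, 0, 1, 0])

kappa = cohen_kappa_score(coder_a, coder_b)        # chance-corrected agreement
percent_agree = (coder_a == coder_b).mean() * 100  # raw percent agreement
print(f"kappa = {kappa:.2f}, agreement = {percent_agree:.0f}%")
```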
From there, we fine-tuned a multilabel text classification model on a pretrained DeBERTa base to predict the presence of each strategy in student writing, using item-wise F1 scores as the loss function. We used an 80/20 train/validation split to make the best use of the limited data.
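A minimal sketch of this kind of setup with the Hugging Face transformers library appears below. The checkpoint name, label set, example texts, and hyperparameters are all assumptions, not the authors' actual configuration; note also that the library's default multilabel loss is per-label binary cross-entropy, so the study's item-wise F1 loss would require overriding the Trainer's loss computation.

```python
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

STRATEGIES = ["rereading", "practice_testing", "self_explanation"]  # hypothetical labels

class StudyStrategyData(Dataset):
    """Pairs each free-text response with a 0/1 vector over strategies."""
    def __init__(self, texts, labels, tokenizer):
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i], dtype=torch.float)
        return item

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-base",
    num_labels=len(STRATEGIES),
    problem_type="multi_label_classification",  # sigmoid head, per-label BCE loss
)

# Toy stand-ins for the coded responses; the study used an 80/20 split.
train_ds = StudyStrategyData(
    ["I reread my notes and quiz myself.", "I explain concepts in my own words."],
    [[1, 1, 0], [0, 0, 1]],
    tokenizer,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="strategy-clf", num_train_epochs=3),
    train_dataset=train_ds,
)
trainer.train()
```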
Results
Results from the classification model (Table 2) revealed that the accuracy of capturing the strategies was highly sensitive to each strategy's prevalence in the training data. Strategies with more than 150 occurrences in the training data achieved F1 scores > 0.8; as occurrences fell below 150, F1 scores dropped (e.g., Self-Explanation). Figure 1 shows the relationship between each strategy's occurrences in the training data and its F1 score.
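The per-strategy ("item-wise") F1 scores reported here can be computed as below; the prediction and label arrays are hypothetical, with the third column illustrating how a rarely predicted label yields a low F1.

```python
import numpy as np
from sklearn.metrics import f1_score

# Rows are validation examples, columns are strategies (hypothetical values).
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 1, 0], [1, 0, 1]])

per_label_f1 = f1_score(y_true, y_pred, average=None)  # one F1 per strategy
for name, f1 in zip(["rereading", "practice_testing", "self_explanation"], per_label_f1):
    print(f"{name}: F1 = {f1:.2f}")
```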
Discussion/Significance
This study shows the potential of large language models to classify student texts according to the study strategies they describe, achieving sufficiently accurate results (F1 > 0.8) for all categories with more than 150 occurrences in the training dataset. In future studies we plan to use data augmentation methods, including the generation of synthetic data, to achieve more balanced classes and improve the model's accuracy on these rarer study strategies.
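As one loose illustration of the rebalancing idea, the sketch below randomly oversamples responses that mention a rare strategy until it reaches the 150-occurrence threshold observed above; the planned synthetic-data generation would go beyond simple duplication, and all names here are hypothetical.

```python
import random

def oversample_rare(texts, labels, rare_idx, target=150, seed=0):
    """Duplicate examples carrying a rare label until it has `target` occurrences."""
    rng = random.Random(seed)
    pool = [(t, l) for t, l in zip(texts, labels) if l[rare_idx] == 1]
    count = len(pool)
    out_texts, out_labels = list(texts), list(labels)
    while count < target and pool:
        t, l = rng.choice(pool)  # resample an existing rare-label example
        out_texts.append(t)
        out_labels.append(l)
        count += 1
    return out_texts, out_labels
```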