Paper Summary
Assessing Critical Thinking in Times of ChatGPT: Think-Aloud Protocol Analysis as an Alternative Approach

Sat, April 13, 1:15 to 2:45pm, Pennsylvania Convention Center, Floor: Level 100, Room 116

Abstract

Objectives
The advent of easy-access AI writing tools like ChatGPT sent shockwaves through higher education. There is not yet an analog to plagiarism checkers for AI-generated essays, which calls into question the validity of term papers as the standard deliverable for assessing learning progress in college classes. AI also challenges the measurement of critical thinking (CT) through performance assessment (PA) tasks, because these rely on essays that integrate complex information. While the PA approach traditionally limits the documents available for a task, the field is currently moving towards critical online reasoning (COR) to keep up with the larger trend in higher education of moving information sources completely online (e.g., libraries, textbooks). With the right prompts in an interactive dialogue, ChatGPT could produce a high-quality CT essay that does not reflect the student's actual skill level.
Framework
A promising way to safeguard CT assessment lies in shifting to monitor and evaluate the process rather than the product. At its core, CT refers to a cognitive process, with the essay being its product. While the product is vulnerable to AI, the process is not: ChatGPT can render a perfectly structured, argumentatively balanced short essay on the question "Do e-bikes promote health?", but the underlying cognitive process remains a black box.
Currently, there are two ways to create process data. One is based on the heuristic-systematic model[13]: the internet search process, itself part of the PA task, becomes a data source for assessing critical thinking, in addition to or instead of the final essay. Log files of the search history, eye-tracking information, or a self-report survey of the information processing provide complex raw data (see proposal by [authors' 2]).
Methods
The current study used transcribed think-aloud protocols recorded while students worked on a "Bryn Bower" (BB) task for 50 minutes. Unlike the open format of COR tasks, BB tasks are closed format (i.e., there is a limited number of documents students can use, usually seven to nine)[0]. This approach has the advantage that the universe of arguments is known (usually around 30), as are the trustworthiness and relevance of the sources[12]. This conceptual feature allows a parallel analysis of process (think-aloud) and product (final essay) with the same coding scheme or rubric.
Data
For participation credit, 42 undergraduates (72% female, ages 18-25) were instructed to verbalize their thoughts while working on a BB task[23]. The transcriptions and the essays were coded independently.
Results
Results show high correlations in the arithmetic scoring of process and product data. However, the analysis also revealed that more arguments were used in the essays than were verbalized in the think-aloud protocols, suggesting that a significant part of CT happens during the writing phase.
Significance
This study indicates that think-aloud protocols need to cover both the reading and the writing phases if they are to be used as CT measures in future research.
