Paper Summary
Advances in Automating Feedback for Argumentative Writing: Feedback Prize as a Case Study

Sun, April 14, 9:35 to 11:05am, Pennsylvania Convention Center, Floor: Level 100, Room 119B

Abstract

Purpose
This chapter discusses the Feedback Prize project, which aimed to advance automated feedback for argumentative writing at the discourse level. Argumentative writing is critical for academic success and the development of essential skills, and discourse-level feedback in particular helps students understand the components of argumentation, strengthen metacognition, and identify opportunities for improvement. The chapter advocates for discourse-level feedback in automated writing evaluation (AWE) systems, with an eye toward algorithmic fairness. To foster the integration of such feedback into AWE systems, the Feedback Prize project provided open datasets, open-source NLP algorithms, and open data science competitions, focusing on algorithms that are accurate, efficient, and unbiased across student populations.

Theoretical Framework
When feedback is provided at the discourse level, students gain a better understanding of the different components of argumentation and develop greater awareness and metacognition of their own learning and writing processes (Arroyo et al., 2021). Targeted feedback at the argument level is also valuable for historically underperforming students because it can help them see low achievement as surmountable (Brown et al., 2016). The discourse elements in the PERSUADE corpus annotation scheme are adapted and simplified from the Toulmin framework of argumentation (Toulmin, 1958).

Methods
The PERSUADE corpus contains around 26,000 annotated student essays drawn from diverse sources, reflecting a range of learning settings. Seven discourse elements were annotated (Lead, Position, Claim, Counterclaim, Evidence, Rebuttal, and Concluding Statement), and inter-rater agreement was high (weighted Cohen’s kappa = 0.74). To address equity, essays from U.S. racial minority groups were oversampled and the sample’s distribution of economic backgrounds was aligned with national averages.
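
As a point of reference for the agreement statistic, the snippet below is a minimal sketch of how a weighted Cohen’s kappa can be computed with scikit-learn; the example ratings and the quadratic weighting are illustrative assumptions, not the project’s actual annotation-adjudication setup.

```python
# Minimal sketch: weighted Cohen's kappa between two annotators' ordinal ratings.
# The ratings and the quadratic weighting below are illustrative assumptions only.
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings of the same items by two annotators (0 = low, 2 = high).
annotator_a = [0, 1, 2, 2, 1, 0, 2, 1, 1, 2]
annotator_b = [0, 1, 2, 1, 1, 0, 2, 2, 1, 2]

# Weighted kappa discounts near-misses on an ordinal scale relative to plain kappa.
kappa = cohen_kappa_score(annotator_a, annotator_b, weights="quadratic")
print(f"Weighted Cohen's kappa: {kappa:.2f}")
```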

Data
The Feedback Prize project hosted two data science competitions to develop open-source algorithms for assisted writing feedback. The first competition, “Evaluating Student Writing,” focused on segmenting essays and labeling their discourse elements. The second competition, “Predicting Effective Arguments,” sought efficient algorithms that rate the quality of each argumentative element on a three-tier scale (Ineffective, Adequate, Effective).
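
To make the second task concrete, the sketch below scores a single discourse element on the three-tier scale with a generic Transformer classifier from the Hugging Face transformers library; the checkpoint, label mapping, and example text are assumptions for illustration and do not reproduce the winning competition solutions.

```python
# Sketch: scoring one discourse element on the three-tier effectiveness scale
# with a generic Transformer classifier. The checkpoint, label order, and example
# text are illustrative assumptions, not the competition-winning models.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["Ineffective", "Adequate", "Effective"]
MODEL_NAME = "distilbert-base-uncased"  # placeholder; the classification head would need fine-tuning

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=len(LABELS))

# Hypothetical Claim element taken from no particular essay.
claim = "Extending the school day would give students more time to master difficult material."
inputs = tokenizer(claim, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

prediction = LABELS[logits.argmax(dim=-1).item()]
print(f"Predicted effectiveness: {prediction}")
```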

Results
Winning solutions from Feedback Prize 1.0 and 2.0 used ensembles of Transformer architectures and achieved high accuracy, as shown in Figures 1 and 2. However, Feedback Prize 2.0 models exhibited bias, with moderately lower accuracy for historically marginalized students such as English Language Learners, as shown in Table 1.
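
One straightforward way to surface the kind of subgroup gap reported here is to compare accuracy by group; the sketch below does so with pandas over hypothetical predictions, and the column names, labels, and groups are assumptions rather than the project’s actual bias-evaluation code.

```python
# Sketch: per-subgroup accuracy comparison as a basic bias check.
# The data, column names, and groups are hypothetical; the actual Feedback Prize
# bias analysis may use different metrics and groupings.
import pandas as pd

results = pd.DataFrame({
    "group": ["ELL", "ELL", "ELL", "non-ELL", "non-ELL", "non-ELL"],
    "true_label": ["Adequate", "Effective", "Ineffective", "Adequate", "Effective", "Adequate"],
    "predicted":  ["Adequate", "Adequate",  "Ineffective", "Adequate", "Effective", "Adequate"],
})

results["correct"] = results["true_label"] == results["predicted"]
per_group_accuracy = results.groupby("group")["correct"].mean()

print(per_group_accuracy)
# The gap between groups is one rough indicator of differential model performance.
print(f"Accuracy gap: {per_group_accuracy.max() - per_group_accuracy.min():.2f}")
```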

Significance
The Feedback Prize project highlights the importance of efficient machine learning models and the potential for accurate discourse prediction in argumentative writing. The algorithms can be used for user-facing features that provide immediate, formative feedback to students and assist teachers in evaluating essays. They can also inform targeted interventions and promote educational equity for historically marginalized students. Overall, the PERSUADE corpus serves as a valuable resource for enhancing AWE systems and understanding argumentative writing effectiveness. Ensuring fairness and mitigating bias are also crucial considerations for the advancement of AWE systems in education.
