Purpose
Writing develops gradually, so it is critical for teachers to be able to track student progress. Manual methods, such as curriculum-based measurement, tend to be labor-intensive and difficult to implement and scale. Automated writing evaluation (AWE) can be implemented at scale, but may not provide scores and feedback granular enough to track growth and build instructionally useful student profiles. Ideally, such profiles would address both process (how students write) and product (what students write). We therefore investigated the use of natural language processing (NLP) and keystroke log analysis to create automated writing trait models that capture both.
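To make the keystroke side of this concrete, the sketch below derives two simple process features, composition fluency and pausing between sentences, from a toy keystroke log. The log schema and feature definitions are illustrative assumptions, not the instrumentation actually used in the study.

```python
import pandas as pd

# Toy keystroke log: one row per key event, with a timestamp in seconds
# and the character produced. Column names are illustrative assumptions.
log = pd.DataFrame({
    "time": [0.0, 0.2, 0.4, 0.5, 2.9, 3.1, 3.3, 3.4, 3.6],
    "char": ["T", "h", "e", ".", "S", "h", "e", " ", "r"],
})

# Inter-key interval: the pause preceding each key event.
log["iki"] = log["time"].diff()

features = {
    # Composition fluency: characters produced per minute of writing time.
    "chars_per_min": 60 * len(log) / (log["time"].iloc[-1] - log["time"].iloc[0]),
    # Pausing between sentences: mean pause following a sentence-final period.
    "pause_between_sentences": log.loc[log["char"].shift().eq("."), "iki"].mean(),
}
print(features)
```

Features of this kind, aggregated per essay, are the raw material for the process dimensions described under Methods.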
Theoretical Framework
Multidimensional analysis was pioneered by Biber (1988) to support the analysis of genre variation in large text corpora. Similar methods have been used to define writing traits in AWE systems (Attali & Powers, 2008) and to identify latent dimensions of student writing processes (Leijten & Van Waes, 2013; Deane & Zhang, 2015). We extend these methods to define large multidimensional models that characterize variation among students in writing process, writing quality, and style.
Methods
We applied previously validated NLP and keystroke feature sets, with known relationships to writing quality, text readability, and other extrinsic measures, to large preexisting corpora of student essays. Exploratory and confirmatory factor analysis of these data motivated a 17-trait model of student essay variation and a 4-trait model of student writing behaviors. We then used structural equation modeling (SEM) to examine differences between student writing profiles before and after instruction and to identify differential impacts of demographic variables.
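As an illustration of the exploratory step, the following sketch fits a 17-factor solution to a feature matrix using the factor_analyzer Python package. The data are synthetic stand-ins, and the feature names, matrix dimensions, and rotation choice are assumptions for illustration, not the study's actual configuration.

```python
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Synthetic stand-in for the real feature matrix: one row per essay,
# one column per NLP or keystroke feature (names are illustrative).
rng = np.random.default_rng(0)
X = pd.DataFrame(
    rng.normal(size=(500, 40)),
    columns=[f"feature_{i}" for i in range(40)],
)

# Exploratory step: extract correlated factors with an oblique rotation,
# since writing-trait dimensions are rarely orthogonal in practice.
efa = FactorAnalyzer(n_factors=17, rotation="promax")
efa.fit(X)

# Inspect which features load on which trait factor.
loadings = pd.DataFrame(efa.loadings_, index=X.columns)
print(loadings.round(2))
```

In practice, an exploratory solution of this kind would then be tested with confirmatory factor analysis before the trait structure is fixed.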
Data
Multidimensional models were trained and validated against several preexisting corpora, including a corpus of 1.4 million student essay submissions to a digital writing service. Effects of instruction were examined in a follow-up study. Participants were 844 seventh-grade students from three public middle schools in a mid-Atlantic state. Of these students, 44.7% were African-American, 26.1% were White, and 18.2% were Hispanic; 3% were English language learners (ELs), and 8.8% were special education students.
Results
The SEM growth model showed good fit (CFI = .905, RMSEA = .05). Most traits in our multidimensional model had significant loadings on the pretest and change-score factors. The change-score factor indicated overall growth in composition fluency, organization, formal language, and cohesion. One school showed significantly stronger growth in overall writing scores and in several specific traits, including cohesion, sentence length, and time spent pausing between sentences. There were also significant differences in trait profiles associated with specific demographic groups: African-American students showed stronger-than-average growth in academic language; English language learners showed increases in composition fluency, accompanied by shifts toward written, as opposed to oral, style; and special education students showed increases in sentence length and complexity.
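A minimal sketch of the kind of growth model behind these fit statistics, using the semopy Python package: a pretest factor measured by time-1 trait scores and a change factor measured by difference scores, with change regressed on pretest. The variable names, three-indicator structure, and synthetic data are assumptions for illustration; the study's actual model covers many more traits.

```python
import numpy as np
import pandas as pd
import semopy

# Synthetic stand-in data: a pretest factor and a change factor, each
# measured by three trait indicators (names are illustrative).
rng = np.random.default_rng(1)
n = 600
pre = rng.normal(size=n)
chg = 0.4 * pre + rng.normal(size=n)
data = pd.DataFrame({
    "t1_fluency":      pre + rng.normal(scale=0.5, size=n),
    "t1_organization": pre + rng.normal(scale=0.5, size=n),
    "t1_cohesion":     pre + rng.normal(scale=0.5, size=n),
    "d_fluency":       chg + rng.normal(scale=0.5, size=n),
    "d_organization":  chg + rng.normal(scale=0.5, size=n),
    "d_cohesion":      chg + rng.normal(scale=0.5, size=n),
})

# Pretest factor on time-1 traits; change factor on difference scores,
# regressed on the pretest factor, as in a latent change score model.
desc = """
pretest =~ t1_fluency + t1_organization + t1_cohesion
change  =~ d_fluency + d_organization + d_cohesion
change ~ pretest
"""
model = semopy.Model(desc)
model.fit(data)

# Fit indices comparable to those reported above (CFI, RMSEA).
print(semopy.calc_stats(model).T)
```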
Significance
This study illustrates the potential of multidimensional writing trait models to support automated growth modeling for writing. Such models can provide schools and instructors with actionable information about student progress by highlighting specific changes in both what and how students write.