High-quality, interpretable, and generalizable scoring is essential for automated essay scoring (AES) in educational settings. However, existing AES systems face two primary challenges: (a) scoring models lack procedural interpretability, and (b) models are mostly evaluated on single datasets, limiting generalization across languages and contexts. This paper proposes an IRT-enhanced AES framework (IRT-AESF) to address these issues. IRT-AESF consists of three modules: a pre-trained language model encodes the essay text, a multi-layer perceptron (MLP) maps the semantic representation to a latent trait score indicative of student writing ability, and an IRT-based scoring module predicts the final score while estimating and presenting key psychometric parameters. Experiments conducted on three large-scale, multilingual, real-world essay datasets demonstrate that IRT-AESF significantly outperforms baseline models.
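To make the three-module pipeline concrete, the sketch below shows one plausible form of the IRT-based scoring step: a latent trait θ (produced upstream by the MLP from the encoder's representation) is converted to score probabilities via a 2PL-style item response function. The abstract does not specify the IRT variant, so all function names, parameters (discrimination `a`, thresholds `b`), and the graded-response-style aggregation are illustrative assumptions, not the paper's actual formulation.

```python
import math

def mlp_trait(embedding, weights, bias):
    # Hypothetical single-layer stand-in for the MLP module:
    # maps an essay embedding (list of floats) to a scalar latent trait theta.
    return sum(x * w for x, w in zip(embedding, weights)) + bias

def irt_2pl(theta, a, b):
    # 2PL item response function: probability of clearing a score
    # threshold with difficulty b, given ability theta and discrimination a.
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def predict_score(theta, a, thresholds):
    # Graded-response-style prediction: the expected score is the sum of
    # probabilities of clearing each successive threshold, so higher
    # ability yields a higher expected score.
    return sum(irt_2pl(theta, a, b) for b in thresholds)

# Example: a toy 4-dimensional "embedding" scored against three thresholds.
theta = mlp_trait([0.2, -0.1, 0.4, 0.3], [1.0, 0.5, 1.5, 1.0], 0.0)
score = predict_score(theta, a=1.2, thresholds=[-1.0, 0.0, 1.0])
```

A design note: because the difficulty and discrimination parameters are explicit, this kind of module can present them after training, which is one way a framework like IRT-AESF could offer the procedural interpretability the abstract claims.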