This study introduces a three-stage automated essay scoring (AES) framework that uses large language models (LLMs) to improve the accuracy, consistency, and scalability of Chinese essay evaluation. Applied to 57 annotated junior secondary essays, the framework uses DeepSeek-R1 for grade classification and Qwen3-14B for pairwise ranking and scoring with benchmark adjustments. Results show strong alignment with human raters (r = 0.827; quadratic weighted kappa, QWK = 0.806) and better performance than direct LLM scoring. The work contributes methodologically to AES by offering a structured, interpretable approach, and it highlights practical applications in supporting teachers and large-scale assessments.
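For readers unfamiliar with the two agreement statistics reported in the abstract, the following is a minimal sketch of how a correlation coefficient and QWK can be computed between human and model scores. It assumes that r is the Pearson correlation and that scores are integers on a fixed scale; the abstract states neither. The function names, the score range, and the sample scores are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def quadratic_weighted_kappa(human, model, min_score, max_score):
    """QWK for integer scores in [min_score, max_score].

    Disagreements are penalised by the squared distance between
    the two scores, normalised by the squared scale width.
    """
    k = max_score - min_score + 1
    n = len(human)

    # Observed confusion matrix: rows are human scores, columns are model scores.
    observed = [[0] * k for _ in range(k)]
    for h, m in zip(human, model):
        observed[h - min_score][m - min_score] += 1

    # Marginal score counts for each rater.
    hist_h = [sum(row) for row in observed]
    hist_m = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were independent.
            weighted_expected += weight * hist_h[i] * hist_m[j] / n
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    # Hypothetical scores on a 1-5 scale, for illustration only.
    human_scores = [1, 2, 3, 4, 5, 3, 4]
    model_scores = [1, 2, 3, 5, 4, 3, 4]
    print(pearson_r(human_scores, model_scores))
    print(quadratic_weighted_kappa(human_scores, model_scores, 1, 5))
```

Under these assumptions, both statistics equal 1 when the model reproduces the human scores exactly. QWK additionally corrects for agreement expected by chance and penalises large score gaps more heavily than small ones.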