Journal of Applied Artificial Intelligence in Education
Vol 2, No 1 (2026): July 2026

Auditable Automated Essay Scoring and Formative Feedback: A Rubric-Grounded Pipeline for Secondary and Higher Education

Qi Xin (University of Pittsburgh)



Article Info

Publish Date
01 Apr 2026

Abstract

Automated essay scoring in education is increasingly expected to do more than reproduce human holistic scores; classroom use also demands rubric-aligned feedback, transparent evidence, and a way to route uncertain cases to teachers. In this study, “LLM-ready” refers to a system that outputs structured score evidence, weak-trait signals, and document-level anchors that can later be verbalized by a language model without changing the underlying decision trace. This study aimed to evaluate whether a rubric-grounded, LLM-ready pipeline can achieve competitive scoring accuracy while also generating auditable formative feedback and a teacher-controllable review signal. The evaluation used the public ASAP corpus of 12,976 essays across eight prompts and prompt-wise five-fold cross-validation. Four holistic scorers were compared: length-only, rubric forest, prompt-adaptive centroid regressor (PACR), and the final RG-Score ensemble with trait grounding, isotonic calibration, and audit control. Auxiliary analytic scoring was examined on Prompts 2, 7, and 8, and feedback experiments were conducted on all 2,292 essays from Prompts 7 and 8. PACR obtained the highest macro-averaged quadratic weighted kappa (QWK) of 0.739, while RG-Score reached 0.738 and provided a calibrated, auditable path to feedback. The prompt-level QWK for RG-Score ranged from 0.66 to 0.82, with particularly strong gains on Prompts 6 and 7. Auxiliary analytic scoring yielded QWK values of 0.623 for the Prompt 2 domain2 score, an average of 0.604 across Prompt 7 traits, and an average of 0.506 across Prompt 8 traits. The rubric-grounded evidence feedback template achieved a Trait Recall@2 of 0.829, a valid evidence rate of 0.912, and an auditability index of 0.893 on Prompts 7 and 8. These findings support rubric-grounded AES as a practical assessment-support approach for secondary-school writing and as a structured foundation for higher-education formative feedback workflows, while also indicating that weaker trait models should be treated as advisory rather than fully autonomous.
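For readers unfamiliar with the agreement metric reported above, the following is a minimal sketch of quadratic weighted kappa (QWK) between human and predicted integer scores for a single prompt. It is an illustration of the standard definition, not the study's evaluation code: the function name `quadratic_weighted_kappa`, its parameters, and the choice to return 1.0 when both score lists are constant and identical are assumptions made here.

```python
from collections import Counter


def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Quadratic weighted kappa between two equal-length lists of integer scores.

    Hypothetical helper for illustration; min_score and max_score give the
    prompt's score range (an assumption about how the range is supplied).
    """
    if len(human) != len(predicted) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    n_levels = max_score - min_score + 1
    if n_levels < 2:
        raise ValueError("the score range must contain at least two levels")

    n = len(human)
    observed = Counter(zip(human, predicted))  # joint counts of (human, predicted)
    human_hist = Counter(human)                # marginal counts for human scores
    pred_hist = Counter(predicted)             # marginal counts for predicted scores

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected under independence
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            expected = human_hist[i] * pred_hist[j] / n
            numerator += weight * observed[(i, j)]
            denominator += weight * expected

    if denominator == 0.0:
        # Both lists contain one identical score; treated as perfect
        # agreement here by convention (an assumption of this sketch).
        return 1.0
    return 1.0 - numerator / denominator
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement. How the per-prompt values are combined into the macro figure is not specified in the abstract, so no averaging step is sketched here.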

Copyright © 2026






Journal Info

Abbrev

JAAIE

Publisher

Subject

Computer Science & IT Education

Description

Applied AI in Classroom Practice, exploring practical classroom implementations such as smart content delivery, AI-powered virtual assistants, and automated learning support tools. Intelligent Tutoring Systems, focusing on adaptive AI-driven systems that personalize instruction based on individual ...