Development of Automated Essay Scoring Using Retrieval Augmented Generation in SAGE
I Gusti Nyoman Sapta Wiguna; Ida Bagus Nyoman Pascima; Luh Putu Eka Damayanthi
RIGGS: Journal of Artificial Intelligence and Digital Business Vol. 5 No. 2 (2026): May-July
Publisher: Prodi Bisnis Digital Universitas Pahlawan Tuanku Tambusai

DOI: 10.31004/riggs.v5i2.8979

Abstract

Academic assessment through essay questions is a fundamental component of the educational ecosystem, designed to measure students’ cognitive depth and critical reasoning. However, manual essay grading presents significant pedagogical and administrative challenges, including high susceptibility to subjective bias and an overwhelming workload for educators. To address these issues, this research develops the Smart Automated Grading Engine (SAGE), a Learning Management System engineered to automate qualitative assessments. SAGE integrates large language models via the Gemini API with a Retrieval Augmented Generation architecture. By strictly grounding the AI evaluation process in teacher-curated reference documents and specific grading rubrics, the system neutralizes the risk of information hallucination. The system was empirically validated at SMAN 1 Blahbatuh using 180 authentic essay responses from 36 eleventh-grade students. The automated assessments were statistically compared against the manual evaluations of three expert history teachers. Technical evaluations using black-box and white-box testing confirmed the platform's functional stability and architectural security. In accuracy testing, the SAGE platform achieved a Quadratic Weighted Kappa coefficient of 0.9133, which places its performance in the "almost perfect agreement" category. Furthermore, 94.44 percent of the system's scores fell within a 10-point tolerance of the teachers' scores. The integration of this technology therefore proves to be an effective, objective, and efficient solution, capable of replicating the sharpness of human evaluation while significantly reducing educator burnout.
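
The two agreement figures reported in the abstract can be illustrated with a minimal sketch. The abstract does not say how the three teachers' scores were combined, what score scale was used, or which software computed the statistics, so everything below is an assumption: integer scores on a fixed range, a single human score per essay, and the hypothetical function names `quadratic_weighted_kappa` and `within_tolerance_rate`.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_score, max_score):
    """Quadratic Weighted Kappa between two lists of integer scores.

    Assumes every score lies in [min_score, max_score] (an assumption,
    the abstract does not state the scale).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    n = max_score - min_score + 1
    if n < 2:
        raise ValueError("the score range must contain at least two values")
    total = len(rater_a)

    # Observed agreement matrix: rows = rater A, columns = rater B.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_score][b - min_score] += 1

    # Marginal score histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty grows with the squared score distance.
            weight = (i - j) ** 2 / (n - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were independent.
            weighted_expected += weight * hist_a[i] * hist_b[j] / total

    if weighted_expected == 0:
        # Both raters gave one identical score to every essay.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


def within_tolerance_rate(system_scores, human_scores, tolerance=10):
    """Fraction of essays whose system score is within `tolerance` points."""
    if len(system_scores) != len(human_scores) or not system_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    hits = sum(
        1 for s, h in zip(system_scores, human_scores) if abs(s - h) <= tolerance
    )
    return hits / len(system_scores)


if __name__ == "__main__":
    # Illustrative, made-up scores; these are not data from the study.
    system = [80, 70, 50]
    human = [85, 60, 30]
    print(quadratic_weighted_kappa(system, human, 0, 100))
    print(within_tolerance_rate(system, human, tolerance=10))
```

A kappa of 1 means identical scores, 0 means agreement no better than chance, and negative values mean systematic disagreement. Under the commonly used Landis and Koch interpretation, values above 0.80 are labelled almost perfect agreement, which is consistent with the label the abstract attaches to 0.9133.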