Found 3 Documents
SIMPLE: System Automatic Essay Assessment for Indonesian Language Subject Examination
Ratna, Anak Agung Putri; Budiardjo, Bagio; Hartanto, Djoko
Makara Journal of Technology Vol. 11, No. 1
Publisher : UI Scholars Hub


Abstract

Evaluation of a student's learning is a very important aspect of the educational process, aimed at measuring the student's understanding of the given lecture materials. Essay-type exams are commonly used as the evaluation tool: the student must answer questions in his or her own sentences, with no answer choices provided, so the answers may vary and reflect the student's best understanding of the material. One weakness of essay-type exams is that grading the answers is difficult and tends to be time consuming. Automatic grading systems that can speed up the grading process are currently being developed in many research institutions. The grading method varies from one system to another; one popular method is Latent Semantic Analysis (LSA). LSA grades an essay by extracting words from a text with a relatively large number of words and representing the sentences in a mathematical or statistical formulation. The grade is determined by matching the important words in the answer against a group of words prepared by the human rater. This paper describes an effort to develop an LSA-based system, enhanced with word weighting, word order, and word synonyms, to improve grading accuracy. The system, called SIMPLE, grades answers written in bahasa Indonesia, with the exam carried out on-line through the Web. In the experiments conducted, the conformity of SIMPLE's grades with those of a human rater lies between 69.80% and 94.64% for small classes, and between 77.18% and 98.42% for medium-sized classes. These results are roughly comparable with those of LSA systems that grade essays written in English.
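The LSA step the abstract describes can be sketched in a few lines: build a term-document count matrix, truncate its SVD to a low rank, and compare document vectors in the resulting latent space by cosine similarity. This is a minimal illustration of the general technique, not the SIMPLE system itself (which adds word weighting, word order, and synonym handling); the toy documents and function name below are invented for the example.

```python
import numpy as np

def lsa_similarity(docs, answer_idx, key_idx, rank=2):
    """Score one document against another in an LSA latent space.

    docs: list of token lists. Rows of the matrix are terms, columns
    are documents; rank-k SVD truncation yields the latent space.
    """
    vocab = sorted({t for d in docs for t in d})
    # Term-document count matrix
    A = np.array([[d.count(t) for d in docs] for t in vocab], dtype=float)
    # Truncated SVD extracts the latent semantic dimensions
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    D = (np.diag(s[:rank]) @ Vt[:rank]).T  # document vectors, one row per doc
    a, k = D[answer_idx], D[key_idx]
    return float(a @ k / (np.linalg.norm(a) * np.linalg.norm(k)))

docs = [
    ["evaluasi", "mengukur", "pemahaman", "mahasiswa"],  # reference answer key
    ["evaluasi", "menilai", "pemahaman", "mahasiswa"],   # paraphrased answer
    ["jaringan", "komputer", "protokol"],                # off-topic answer
]
close = lsa_similarity(docs, 1, 0)  # paraphrase vs. key: high similarity
far = lsa_similarity(docs, 2, 0)    # off-topic vs. key: low similarity
```

The paraphrased answer lands near the key in the latent space even though one word differs, while the off-topic answer does not; a real system would map such similarities onto a grade scale.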
DEVELOPMENT AND OPTIMIZATION OF SIMPLE-O: AN AUTOMATED ESSAY SCORING SYSTEM FOR JAPANESE LANGUAGE BASED ON BERT, BILSTM, AND BIGRU
Junaidi, Muhammad Aidan Daffa; Ratna, Anak Agung Putri
International Journal of Electrical, Computer, and Biomedical Engineering Vol. 3 No. 3 (2025)
Publisher : Universitas Indonesia

DOI: 10.62146/ijecbe.v3i3.100

Abstract

This study aims to develop an Automatic Essay Scoring System (SIMPLE-O) for Japanese essays, consisting of five short essay questions. SIMPLE-O is designed to enhance scoring accuracy by leveraging deep learning models such as BERT, BiLSTM, and BiGRU. The research evaluates deep-level score predictions for each question, rather than only considering the total score across the five questions, to provide more reliable and accurate assessments. SIMPLE-O compares student responses with three predefined answer keys using two similarity measurement methods: Cosine Similarity and Manhattan Distance. The study employs datasets developed through data augmentation techniques applied to lecturer and student responses. The system is implemented using Python, and its performance is evaluated through analyses of various architectures based on specified hyperparameters. The best results were achieved using a BERT-BiLSTM architecture with the Cosine Similarity method, configured with a batch size of 8, 256 hidden state units, a learning rate of 0.00001, and 100 epochs. The evaluation demonstrated that this approach achieved a Mean Absolute Percentage Error (MAPE) of 7.230% and an average score difference of 5.689. This research highlights the potential of SIMPLE-O for automated scoring of Japanese essays, offering improved accuracy, reliability, and deeper analytical insights.
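The two similarity measures named in the abstract are straightforward to compute once a student response and an answer key are embedded as vectors. The sketch below uses placeholder vectors in place of real BERT-BiLSTM/BiGRU embeddings; the values and names are invented for illustration, and the mapping of Manhattan distance to a similarity is one common convention, not necessarily the paper's.

```python
import numpy as np

def cosine_similarity(u, v):
    """Angle-based similarity in [-1, 1]; 1 means identical direction."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def manhattan_similarity(u, v):
    """Map Manhattan (L1) distance into (0, 1]; 1 means identical vectors."""
    return float(np.exp(-np.sum(np.abs(u - v))))

# Stand-in sentence embeddings; in SIMPLE-O these would come from a
# BERT encoder followed by a BiLSTM/BiGRU layer (placeholder values here).
student = np.array([0.8, 0.1, 0.3])
keys = [np.array([0.9, 0.2, 0.25]),  # predefined answer key 1
        np.array([0.1, 0.9, 0.8]),   # predefined answer key 2
        np.array([0.7, 0.0, 0.4])]   # predefined answer key 3

# Score the response against the best-matching of the three keys
best_cos = max(cosine_similarity(student, k) for k in keys)
best_man = max(manhattan_similarity(student, k) for k in keys)
```

Comparing against all three predefined keys and keeping the best match mirrors the abstract's setup, where a response need only agree with one acceptable phrasing of the answer.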
Defying Data Scarcity: High-Performance Indonesian Short Answer Grading via Reasoning-Guided Language Model Fine-Tuning
Faza, Muhammad Naufal; Purnamasari, Prima Dewi; Ratna, Anak Agung Putri
International Journal of Electrical, Computer, and Biomedical Engineering Vol. 3 No. 3 (2025)
Publisher : Universitas Indonesia

DOI: 10.62146/ijecbe.v3i3.148

Abstract

Automated Short Answer Grading (ASAG) is crucial for scalable feedback, but applying it to low-resource languages like Indonesian is challenging. Modern Large Language Models (LLMs) severely overfit small, specialized educational datasets, limiting their utility. This study compares nine traditional machine learning models against two fine-tuning strategies for Gemma-3-1b-it on an expanded Indonesian ASAG dataset (n=220): (a) standard fine-tuning predicting only scores, and (b) a proposed reasoning-guided approach in which the model first generates a score rationale, learned via knowledge distillation, before predicting the score. The reasoning-guided model (Gemma-3-1b-ASAG-ID-Reasoning) achieved state-of-the-art performance (QWK 0.7791; Spearman's ρ 0.8276), significantly surpassing the best traditional model in this study (SVR, QWK 0.6952). This work advances foundational LSA-based approaches to this task by introducing a more robust methodology and evaluation framework. Crucially, standard fine-tuning (Gemma-3-1b-ASAG-ID) suffered catastrophic overfitting (QWK 0.7279), indicated by near-perfect training scores but poor test performance. While the reasoning-guided LLM showed superior accuracy, it required over 35 times more inference time. The results demonstrate that distilled reasoning acts as a powerful regularizer, compelling the LLM to learn the underlying grading logic rather than memorizing answer-score pairs, establishing a viable method for high-performance ASAG in data-scarce environments despite the computational trade-offs.
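Quadratic Weighted Kappa (QWK), the headline metric in the abstract above, rewards exact agreement between model and human scores and penalizes disagreements by the squared distance between ordinal categories, so being off by two grades costs four times as much as being off by one. A minimal sketch of the standard formula, with hypothetical grades rather than the paper's data:

```python
import numpy as np

def quadratic_weighted_kappa(rater_a, rater_b, n_classes):
    """Agreement between two raters on ordinal scores in {0..n_classes-1},
    with quadratic penalties for disagreement; 1 = perfect, 0 = chance."""
    # Observed confusion matrix between the two raters
    O = np.zeros((n_classes, n_classes))
    for a, b in zip(rater_a, rater_b):
        O[a, b] += 1
    # Quadratic penalty weights: (i - j)^2, normalized
    i, j = np.indices((n_classes, n_classes))
    W = (i - j) ** 2 / (n_classes - 1) ** 2
    # Expected matrix under independence, from the marginal histograms
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()
    return 1.0 - (W * O).sum() / (W * E).sum()

# Hypothetical grades on a 0-4 scale (not from the paper's dataset)
human = [0, 1, 2, 3, 4, 2, 3, 1]
model = [0, 1, 2, 2, 4, 2, 3, 0]
kappa = quadratic_weighted_kappa(human, model, 5)
```

Here the model disagrees with the human rater on two answers, each by one grade, so the kappa stays high; larger or more frequent disagreements would pull it toward zero.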