IAES International Journal of Artificial Intelligence (IJ-AI)
Vol 14, No 5: October 2025

Automatic essay scoring: leveraging Jaccard coefficient and Cosine similarity with n-gram variation in vector space model approach

Dwi Cahyani, Andharini
Fathoni, Moh. Wildan
Rachman, Fika Hastarita
Basuki, Ari
Amin, Salman
Khotimah, Bain Khusnul



Article Info

Publish Date
01 Oct 2025

Abstract

Automated essay scoring (AES) is a vital area of research that aims to provide efficient and accurate tools for assessing written content. This study investigates the effectiveness of two popular similarity metrics, the Jaccard coefficient and Cosine similarity, within vector space models (VSM) that use unigram, bigram, and trigram representations. The data used in this research were obtained from formative essays for the citizenship education subject in a junior high school. Each essay is preprocessed to extract features using n-gram models, then vectorized to transform the text into a numerical representation. Similarity scores are then computed between essays using both the Jaccard coefficient and Cosine similarity. System performance is evaluated with the root mean square error (RMSE), which measures the difference between the scores given by human graders and those generated by the system. The results show that Cosine similarity outperformed the Jaccard coefficient. Among the n-gram settings, unigrams yielded lower RMSE than bigrams and trigrams.
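
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the tokenization, the term weighting, or how a similarity value is mapped to a score, so the regex tokenizer, the raw n-gram counts, and all function names below are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def ngrams(text, n):
    """Split text into lowercase word tokens and return its word n-grams.

    Assumption: a simple regex tokenizer; the study's preprocessing may differ.
    """
    tokens = re.findall(r"\w+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(grams_a, grams_b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B| over the sets of n-grams."""
    set_a, set_b = set(grams_a), set(grams_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine(grams_a, grams_b):
    """Cosine similarity between raw n-gram count vectors.

    Assumption: plain term counts; a weighting such as TF-IDF could be used instead.
    """
    counts_a, counts_b = Counter(grams_a), Counter(grams_b)
    dot = sum(counts_a[g] * counts_b[g] for g in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rmse(human_scores, system_scores):
    """Root mean square error between human and system scores."""
    squared = [(h - s) ** 2 for h, s in zip(human_scores, system_scores)]
    return math.sqrt(sum(squared) / len(squared))
```

To compare the two metrics under this sketch, one would compute `jaccard` and `cosine` for each essay pair at `n = 1, 2, 3`, map each similarity to the grading scale (for example by multiplying by a hypothetical maximum score), and compare the resulting `rmse` values against the human-assigned scores.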

Copyrights © 2025






Journal Info

Abbrev

IJAI

Publisher

Subject

Computer Science & IT Engineering

Description

IAES International Journal of Artificial Intelligence (IJ-AI) publishes articles in the field of artificial intelligence (AI). The scope covers all areas of artificial intelligence and their applications in the following topics: neural networks; fuzzy logic; simulated biological evolution algorithms (like ...