The growing adoption of e-learning in higher education makes it harder to safeguard the originality of students' academic work (assignments, reports, theses, and others). One violation that is particularly difficult to monitor is intra-class collusion. This study evaluates the effectiveness of three lexical similarity algorithms, the Jaccard Index, Levenshtein Distance, and Cosine Similarity, as the basis for an efficient automatic plagiarism detection instrument. Lexical methods were chosen because they consume fewer computational resources than semantic or word-embedding methods, which makes them well suited to real-time implementation on a learning management system (LMS) platform. The dataset consists of 854 student discussion responses drawn from two courses at an institution that delivers instruction entirely through e-learning. The research stages comprise text pre-processing, similarity score calculation, and threshold optimization to balance false positive and false negative rates. Experimental results show that Levenshtein Distance performs best, with an F2-score of 0.81044 at a threshold of 0.45. Because the F2-score weights recall more heavily than precision, this value indicates high sensitivity to variations in text manipulation in digital learning environments. This research provides a theoretical and practical foundation for institutions developing lightweight yet accurate academic integrity monitoring features.
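The three lexical measures and the F2-score named above can be illustrated with a minimal Python sketch. It is not the study's implementation: the abstract does not specify the pre-processing, the unit of comparison, or how Levenshtein Distance is scaled to a 0-1 score, so the `tokenize` helper, the word-level Jaccard and cosine variants, the character-level edit distance, and the `1 - d / max_length` normalization are all assumptions made here for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (assumed pre-processing)."""
    return re.findall(r"\w+", text.lower())


def jaccard_similarity(a, b):
    """Size of the token-set intersection divided by the size of the union."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def levenshtein_distance(a, b):
    """Character-level edit distance using a two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def cosine_similarity(a, b):
    """Cosine of the angle between raw term-frequency vectors."""
    ca, cb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def f_beta(tp, fp, fn, beta=2.0):
    """F-beta from confusion counts; beta=2 gives the F2-score."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denominator if denominator else 0.0
```

Under these assumptions, a pair of responses would be flagged when its similarity score reaches the chosen threshold, and sweeping that threshold while recomputing `f_beta` against labelled pairs is one way to carry out the threshold optimization stage. Using `beta=2` penalizes missed collusion cases (false negatives) more heavily than false alarms.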
Copyright © 2026