Indonesian Journal of Electrical Engineering and Computer Science
Vol 38, No 3: June 2025

Plagiarism detection using text-representing centroids techniques

Nualnim, Sureeporn
Maliyaem, Maleerat
Unger, Herwig



Article Info

Publish Date
01 Jun 2025

Abstract

This study addresses the limitations of traditional plagiarism detection methods by introducing the text-representing centroid (TRC) technique. TRC is designed to improve the accuracy of detecting semantic similarities and sophisticated forms of plagiarism. It uses a co-occurrence graph to identify centroid terms that represent the core meaning of text documents, capturing the contextual associations between terms. Extensive experiments were conducted on a dataset of academic papers to assess TRC’s performance against traditional techniques across several categories of plagiarism, including near-copy, modified-copy, and paraphrasing. TRC achieved an average precision of 0.96 and a recall of 0.71, surpassing methods such as Jaccard and cosine similarity in detecting more complex forms of plagiarism. These findings highlight TRC’s potential as a robust tool for both academic and industry applications, helping to ensure the integrity of textual content through precise and comprehensive plagiarism detection.
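The idea of a centroid term on a co-occurrence graph can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or weighted, so everything below is an assumption: terms are linked when they share a sentence, edges are unweighted, distance is the number of hops, and the centroid is the term with the smallest mean distance to the terms of a text. The function names (`build_cooccurrence_graph`, `hop_distances`, `centroid_term`, `jaccard`) are hypothetical and are not taken from the paper. A plain Jaccard word-overlap score is included only as an example of the kind of baseline the abstract mentions.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_cooccurrence_graph(sentences):
    """Assumed construction: link two terms whenever they share a sentence."""
    graph = defaultdict(set)
    for sentence in sentences:
        terms = sorted(set(sentence.lower().split()))
        for a, b in combinations(terms, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def hop_distances(graph, source):
    """Breadth-first search: edge count from source to every reachable term."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def centroid_term(graph, text):
    """Return the graph term with the smallest mean hop distance to the
    text's terms (assumed definition), or None if no candidate reaches all."""
    terms = {t for t in text.lower().split() if t in graph}
    best, best_score = None, float("inf")
    if not terms:
        return best
    for candidate in sorted(graph):  # sorted for deterministic tie-breaking
        dist = hop_distances(graph, candidate)
        if any(t not in dist for t in terms):
            continue  # candidate cannot reach every term of the text
        score = sum(dist[t] for t in terms) / len(terms)
        if score < best_score:
            best, best_score = candidate, score
    return best


def jaccard(text_a, text_b):
    """Baseline: word-set overlap divided by word-set union."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


if __name__ == "__main__":
    corpus = ["graph centroid", "centroid term", "term document"]
    g = build_cooccurrence_graph(corpus)
    print(centroid_term(g, "graph centroid term"))
    print(jaccard("graph centroid term", "centroid term document"))
```

Under these assumptions, two documents could be compared by the hop distance between their centroid terms rather than by shared words, which is one way a graph-based measure might still relate paraphrased texts that have little vocabulary in common.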

Copyrights © 2025