This study addresses the limitations of traditional plagiarism detection methods by introducing the text-representing centroid (TRC) technique. TRC is designed to improve the accuracy of detecting semantic similarities and sophisticated forms of plagiarism. It utilizes a co-occurrence graph to identify centroid terms that represent the core meaning of text documents, effectively capturing the contextual associations between terms. Extensive experiments were conducted on a dataset of academic papers to assess TRC’s performance against traditional techniques across various categories of plagiarism, including near-copy, modified-copy, and paraphrasing. The results demonstrate the effectiveness of the TRC technique, achieving an average precision of 0.96 and a recall of 0.71. This performance surpasses methods such as Jaccard and Cosine similarity in accurately detecting more, complex forms of plagiarism. These findings highlight TRC’s potential as a robust tool for both academic and industry applications, helping to ensure integrity in textual content through precise and comprehensive plagiarism detection.
Copyrights © 2025