Jurnal Teknologi Sistem Informasi dan Sistem Komputer TGD
Vol. 8 No. 1 (2025): J-SISKO TECH EDISI JANUARI

Combination of TF-IDF and Rabin-Karp for Detecting Document Similarity in Student Thesis Abstracts

Saputro, Pujo Hari
Pontoh, Fransisca Joanet
Tumurang, Olivia Maria



Article Info

Publish Date
25 Jan 2025

Abstract

Final-semester students are required to complete a final project in the form of research relevant to their field of study, with the aim of finding innovative solutions and developing critical thinking skills. Plagiarism, however, is a recurring problem. Plagiarism is the act of taking someone else's work, including their opinions, and claiming it as one's own. Technology can therefore be used to detect similarities among the abstracts of manuscripts that students submit when proposing thesis titles, allowing plagiarism to be detected early. The corpus was taken from the final-project directories of the Computer Engineering Study Program (98 data points) and the Civil Engineering Study Program (40 data points). This study applies the TF-IDF and Rabin-Karp algorithms. TF-IDF was found to be capable of capturing the importance of a word in a document relative to the entire corpus. Rabin-Karp also proved effective at detecting matching patterns across several corpora, with a pattern-matching accuracy of 70%.
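As a rough illustration of the two techniques named in the abstract, the sketch below computes TF-IDF weights for a small set of documents and finds pattern occurrences with a Rabin-Karp rolling hash. It is not the authors' implementation: the abstract does not give the tokenization, the TF-IDF variant, the hash parameters, or how the 70% accuracy was measured. The function names (`tokenize`, `tf_idf`, `rabin_karp`), the `ln(N / df)` weighting, and the `base` and `mod` values are all assumptions made for this example.

```python
import math
from collections import Counter

PUNCT = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation (assumed scheme)."""
    stripped = (word.strip(PUNCT) for word in text.lower().split())
    return [word for word in stripped if word]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = term count / document length, idf = ln(N / df).
    A term that appears in every document therefore gets weight 0.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append(
            {term: (count / total) * math.log(n_docs / df[term])
             for term, count in counts.items()}
        )
    return weights


def rabin_karp(pattern, text, base=256, mod=1_000_003):
    """Return the start indices of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the leading character in a window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    matches = []
    for i in range(n - m + 1):
        # Compare the actual characters only on a hash hit, to rule out collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches
```

For example, `tf_idf(["plagiarism detection system", "plagiarism in thesis abstracts"])` gives "plagiarism" a weight of zero in both documents because it occurs in each of them, while `rabin_karp("abra", "abracadabra")` reports the two positions where the pattern starts.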

Copyright © 2025






Journal Info

Abbrev

jsk

Publisher

Subject

Computer Science & IT

Description

Bioinformatics/Biomedical Applications, Biometrical Application, Computer Network and Architecture, Computer Vision, Content-Based Multimedia Retrievals, Information System, Data analysis, Fuzzy Logic, Genetic Algorithm, High Performance Computing, Image Processing, Information Retrieval, Information Security ...