This research tests the performance of the Cosine Similarity method against the Jaccard Similarity method for plagiarism detection and reports the similarity percentage each method produces. The sample data are drawn from student data at the Budi Luhur campus. The model is evaluated by comparing several original theses with documents containing plagiarism. The original documents are first processed using Natural Language Processing (NLP) methods, one of which is the Jaro-Winkler method, used here for spelling correction. Text mining algorithms are then applied for text processing. The results show that the Cosine Similarity method achieves a high accuracy of 96.63%, demonstrating its ability to classify documents as plagiarized or not. The Jaccard Similarity method achieves a much lower accuracy of around 50.5%, but this result indicates where the model could be improved or updated to raise performance.
Keywords - Cosine Similarity, Jaccard Similarity, Thesis Classification, Threshold, NLP
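To make the comparison concrete, the following minimal Python sketch computes both similarity measures for a pair of texts and applies a threshold to label the pair. It is an illustration only, not the implementation used in this study: the whitespace tokenizer, the function names, and the threshold value of 0.5 are all assumptions, and the Jaro-Winkler spelling-correction step is not reproduced.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization; the study's
    # actual preprocessing (including spelling correction) is not shown.
    return text.lower().split()


def cosine_similarity(doc_a, doc_b):
    # Cosine of the angle between the two term-frequency vectors.
    freq_a, freq_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def jaccard_similarity(doc_a, doc_b):
    # Size of the shared vocabulary divided by the size of the combined one.
    set_a, set_b = set(tokenize(doc_a)), set(tokenize(doc_b))
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def is_plagiarism(score, threshold=0.5):
    # Hypothetical threshold; the value used in the study is not stated here.
    return score >= threshold


original = "text mining is applied to thesis documents"
suspect = "text mining is applied to student thesis documents"
print(f"cosine:  {cosine_similarity(original, suspect):.2%}")
print(f"jaccard: {jaccard_similarity(original, suspect):.2%}")
```

Because Jaccard Similarity ignores term frequency and counts only shared vocabulary, it generally yields a lower score than Cosine Similarity for the same pair of documents, so the two methods may need different thresholds.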
Copyright © 2024