JURIKOM (Jurnal Riset Komputer)
Vol 9, No 2 (2022): April 2022

Penerapan Algoritma Winnowing dan Word-Level Trigrams Untuk Mengidentifikasi Kesamaan Kata [Application of the Winnowing Algorithm and Word-Level Trigrams to Identify Word Similarity]

Rezki Ramdhani (Universitas Ahmad Dahlan, Yogyakarta)
Abdul Fadlil (Universitas Ahmad Dahlan, Yogyakarta)
Sunardi Sunardi (Universitas Ahmad Dahlan, Yogyakarta)



Article Info

Publish Date
29 Apr 2022

Abstract

Identifying identical words in two or more texts is the first step in detecting plagiarism. Plagiarism detection software is commercially available but relatively expensive, and the free offerings provide very limited features. A freely accessible word similarity detection system is therefore needed as an alternative. Pattern matching is one approach to finding word similarity between documents. Several algorithms can be used for this purpose, including the Winnowing algorithm, which is known to perform well in detecting word similarity. Winnowing is a hashing-based algorithm that applies a hash function and window formation to obtain fingerprints during pattern matching. The word similarity level can then be calculated from these fingerprints. Previous studies have mostly calculated similarity at the character level, while similarity calculation at the word level remains limited. This research aims to measure word similarity using the Winnowing algorithm and word-level trigrams. The results show that the Winnowing algorithm applied with word-level trigrams detected text similarity levels of 76.84%, 52.29%, 37.40%, and 19.29%, respectively. It can be concluded that pattern matching with the Winnowing algorithm and word-level trigrams can be used to measure the level of text similarity.
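
The pipeline the abstract describes (word-level trigrams, hashing, window formation, fingerprint comparison) can be illustrated with a minimal Python sketch. The abstract does not state the hash function, the window size, or the similarity formula, so the choices here are assumptions made for illustration only: a truncated MD5 digest as the hash, a window of 4, and Jaccard similarity over the fingerprint values. The function names (`word_trigrams`, `stable_hash`, `winnow`, `similarity`) are hypothetical and do not come from the paper.

```python
import hashlib
import re


def word_trigrams(text):
    """Lowercase the text, split it into words, and return overlapping word trigrams."""
    words = re.findall(r"\w+", text.lower())
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]


def stable_hash(gram):
    """Deterministic 32-bit hash of a trigram.

    Assumption: truncated MD5 stands in for whatever hash the paper uses.
    """
    return int(hashlib.md5(gram.encode("utf-8")).hexdigest()[:8], 16)


def winnow(hashes, window=4):
    """Return the set of (position, hash) fingerprints chosen by winnowing.

    Each sliding window contributes its minimum hash; ties go to the
    rightmost occurrence. The window size of 4 is an assumption.
    """
    if not hashes:
        return set()
    window = min(window, len(hashes))
    fingerprints = set()
    for start in range(len(hashes) - window + 1):
        win = hashes[start:start + window]
        smallest = min(win)
        # Index of the rightmost occurrence of the minimum inside this window.
        offset = window - 1 - win[::-1].index(smallest)
        fingerprints.add((start + offset, smallest))
    return fingerprints


def similarity(text_a, text_b, window=4):
    """Jaccard similarity, in percent, between the fingerprint values of two texts.

    Assumption: Jaccard is used here; the paper's own formula may differ.
    """
    fp_a = {h for _, h in winnow([stable_hash(g) for g in word_trigrams(text_a)], window)}
    fp_b = {h for _, h in winnow([stable_hash(g) for g in word_trigrams(text_b)], window)}
    if not fp_a and not fp_b:
        return 0.0
    return 100.0 * len(fp_a & fp_b) / len(fp_a | fp_b)
```

Calling `similarity(doc_a, doc_b)` on two strings returns a percentage between 0 and 100. Because the hash, window size, and similarity formula are assumed, the sketch is not expected to reproduce the percentages reported in the abstract.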

Copyrights © 2022






Journal Info

Abbrev

jurikom

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering

Description

JURIKOM (Jurnal Riset Komputer) covers the fields of Informatics, Information Systems, Informatics Management, DSS, AI, ES, and Networking, and serves as a venue for publishing research results, both conceptual and technical, related to Informatics and Computer Technology. The main topics ...