Journal : Indonesian Journal of Electrical Engineering and Computer Science

An arrangement of the number of K-grams in the performance of Rabin Karp algorithm in text adjustment
Yuli Astuti; Irma Rofni Wulandari
Indonesian Journal of Electrical Engineering and Computer Science Vol 26, No 3: June 2022
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v26.i3.pp1388-1394

Abstract

The Rabin Karp algorithm is frequently used to determine the similarity between texts; it uses a hash function to compare the pattern string with substrings of the text. The choice of the k value in the K-gram is often left unrestricted, and trying the candidate k values one by one when cutting the terms takes a long time. This research performs a word-cutting test on a script using K-grams 0 to 8 and reports the effect of each K value on the resulting similarity percentage. The aim is to determine the effect of the number of K-grams on the performance of Rabin Karp in text matching. The test used 20 sentences and was run 10 times, with the Dice coefficient as the text similarity measure. The research concludes that K-grams 0 to 2 should not be used, because of the basic principle of the K-gram: cutting the text into character sequences. A fragment of 0, 1, or 2 characters does not yet carry meaning, and it therefore yields a high similarity percentage. Based on trials with K-grams 0 to 8 on 10 test data sets, K-gram 3 is the best among K-grams 0 to 8.
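The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the hash parameters or say whether the Dice coefficient is computed over sets or multisets of K-gram hashes. The function names `rolling_hashes` and `dice_similarity`, the `base` and `mod` values, and the use of sets are all assumptions made here for illustration.

```python
def rolling_hashes(text, k, base=256, mod=1_000_003):
    """Return the set of Rabin-Karp rolling hashes of all k-grams in text.

    base and mod are illustrative choices, not values from the paper.
    """
    if k < 1 or k > len(text):
        return set()
    high = pow(base, k - 1, mod)  # weight of the character leaving the window
    h = 0
    for ch in text[:k]:           # hash of the first window
        h = (h * base + ord(ch)) % mod
    hashes = {h}
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % mod  # drop the leftmost character
        h = (h * base + ord(text[i])) % mod      # append the new character
        hashes.add(h)
    return hashes


def dice_similarity(a, b, k):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) over sets of k-gram hashes (assumed variant)."""
    ha, hb = rolling_hashes(a, k), rolling_hashes(b, k)
    if not ha and not hb:
        return 0.0
    return 2 * len(ha & hb) / (len(ha) + len(hb))


if __name__ == "__main__":
    for k in range(1, 4):
        print(k, dice_similarity("night", "nacht", k))
```

For the pair "night" and "nacht", this sketch gives 0.6 at k = 1 and 0.25 at k = 2: shorter K-grams match more often by chance, which is consistent with the abstract's observation that very small K values inflate the similarity percentage.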