(JELIKU) Jurnal Elektronik Ilmu Komputer Udayana
Vol 8 No 3 (2020): JELIKU Volume 8 No 3, February 2020

Lemmatization in Balinese Language

Purnajiwa Arimbawa, I Gede Angga
Sanjaya ER, Ngurah Agus



Article Info

Publish Date
25 Jan 2020

Abstract

Lemmatization is the process of extracting the root word from an affixed word, with the aim of reducing variations of a word to its root. Previous research on root word extraction in the Balinese language has used rule-based methods to remove affixes from words. The weakness of the rule-based method is that it can only handle words that comply with the set of rules provided. However, writings in Balinese often contain typographical errors, because speakers tend to write words the way they are spoken instead of following the correct spelling rules. In this research, we apply the Levenshtein distance method to overcome this shortcoming. When all the rules applied to a given word fail, the Levenshtein distance method is used to list all words that are "close" to it. We then select the closest word as the root word of the given input. In our experiments, the proposed method achieved an accuracy of 96.01%.
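The two-stage idea described in the abstract (try the affix rules first, then fall back to Levenshtein distance) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the tiny lexicon `ROOT_WORDS`, the affix lists `PREFIXES` and `SUFFIXES`, the helper names, and the choice to compare every stripped candidate against the lexicon in the fallback step. The authors' actual rule set and dictionary are not given in this abstract.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


# Hypothetical toy data, for illustration only (not the paper's resources).
ROOT_WORDS = {"jemak", "tulis", "beli"}
PREFIXES = ("ma", "ka", "pa")
SUFFIXES = ("ang", "in", "e")


def strip_affixes(word: str) -> set:
    """Return the word plus variants with at most one prefix and one suffix removed."""
    stems = {word}
    stems |= {word[len(p):] for p in PREFIXES if word.startswith(p)}
    candidates = set(stems)
    for stem in stems:
        for suffix in SUFFIXES:
            if stem.endswith(suffix):
                candidates.add(stem[:-len(suffix)])
    return candidates


def lemmatize(word: str) -> str:
    """Rule-based lookup first; Levenshtein fallback when no rule yields a known root."""
    candidates = strip_affixes(word.lower())
    for cand in sorted(candidates):
        if cand in ROOT_WORDS:
            return cand
    # Fallback (an assumption of this sketch): pick the lexicon entry closest
    # to any stripped candidate; ties are broken alphabetically by root.
    _, root = min((levenshtein(c, r), r)
                  for c in candidates for r in sorted(ROOT_WORDS))
    return root


print(lemmatize("tulisang"))  # rule succeeds: suffix "ang" removed -> tulis
print(lemmatize("tullis"))    # rules fail; closest lexicon entry -> tulis
```

In this sketch the fallback always returns some root, even for an input far from every lexicon entry. A practical system would probably add a maximum-distance threshold, which the abstract does not specify.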

Copyrights © 2020






Journal Info

Abbrev

JLK

Subject

Computer Science & IT

Description

Aim and Scope: JELIKU publishes original papers in the field of computer science, including but not limited to the following scope: Computer Science, Computer Engineering, and Informatics; Computer Architecture; Parallel and Distributed Computer; Computer Network; Embedded System; Human-Computer Interaction ...