Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer
Vol 2 No 1 (2018): Januari 2018

Identifikasi Kesalahan Penulisan Kata (Typographical Error) pada Dokumen Berbahasa Indonesia Menggunakan Metode N-gram dan Levenshtein Distance

Arina Indana Fahma (Fakultas Ilmu Komputer, Universitas Brawijaya)
Imam Cholissodin (Fakultas Ilmu Komputer, Universitas Brawijaya)
Rizal Setya Perdana (Fakultas Ilmu Komputer, Universitas Brawijaya)



Article Info

Publish Date
08 Aug 2017

Abstract

Text is one of communication and information media in human life. The crucial thing in text writing is a mistake in word writing called typographical error. The error occurs while using the keyboard on computer or on smartphone. Typographical error on a text can lead to something unpredictable for some people. Based on that reason, a system is needed to identify typographical error in a text and also make the correction of the error word. N-gram and Levenshtein Distance method can be used for correcting typographical error in the text. For detecting how many word candidates of typographical error, Levenshtein Distance can be implemented. Because the word candidates are unsorted, N-gram method is using to sort those word candidates based on the value of cosine similarity. In this research, the reason N-gram method using N=2 is to separated each two characters of identified typographical error and its word candidates.The value of cosine similarity calculated by tf-idf when the process of N-gram was done. The result of test scenario, the best value of precision is 0.97 from insertion type and the best value of recall is 1 from substitution type.

Copyrights © 2018






Journal Info

Abbrev

j-ptiik

Publisher

Subject

Computer Science & IT Control & Systems Engineering Education Electrical & Electronics Engineering Engineering

Description

Jurnal Pengembangan Teknlogi Informasi dan Ilmu Komputer (J-PTIIK) Universitas Brawijaya merupakan jurnal keilmuan dibidang komputer yang memuat tulisan ilmiah hasil dari penelitian mahasiswa-mahasiswa Fakultas Ilmu Komputer Universitas Brawijaya. Jurnal ini diharapkan dapat mengembangkan penelitian ...