eProceedings of Engineering
Vol. 11 No. 4 (2024): August 2024

Identifikasi Bahasa Daerah di Indonesia Dengan Multinomial Naïve Bayes (Identification of Regional Languages in Indonesia with Multinomial Naïve Bayes)

Ardhianda, Maulana Muhammad
Ardiyanti, Suryani Arie



Article Info

Publish Date
21 Oct 2024

Abstract

Language identification has been studied extensively, but few results are available for the regional languages of Indonesia. This research therefore addresses the identification of local languages in Indonesia across seven languages: Indonesian, Javanese, Sundanese, Minang, Muna, Bugis, and Madurese. Languages are identified with the Multinomial Naïve Bayes method, which calculates the probability of each word pattern (a single word or a sequence of words, i.e., an n-gram) appearing in sentences labeled with a language. The resulting probability model is then used to determine the class, that is, the language, of new sentences. The performance of the method is measured in two test scenarios. The first examines the effect of the n-gram pattern on the F-measure, and the second examines the effect of the amount of training data on the F-measure. The results show that the unigram and bigram patterns give the highest accuracy, 98.86%. With 1,500 training sentences per language, the method reaches an accuracy of 98%.

Keywords: language identification, local languages, multinomial naïve bayes
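The approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six training sentences are hypothetical toy examples rather than the paper's corpus, only three of the seven languages are shown, and the names `ngrams` and `MultinomialNB`, the whitespace tokenization, and the add-one smoothing parameter `alpha` are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(sentence, n_values=(1, 2)):
    """Return the word n-grams of a sentence (unigrams and bigrams by default)."""
    words = sentence.lower().split()
    feats = []
    for n in n_values:
        for i in range(len(words) - n + 1):
            feats.append(" ".join(words[i:i + n]))
    return feats


class MultinomialNB:
    """Multinomial Naive Bayes over n-gram counts (illustrative sketch only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing; value is an assumption

    def fit(self, sentences, labels):
        self.counts = defaultdict(Counter)  # language -> n-gram counts
        self.class_docs = Counter(labels)   # language -> number of sentences
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            feats = ngrams(sentence)
            self.counts[label].update(feats)
            self.vocab.update(feats)
        self.n_docs = len(labels)
        return self

    def log_scores(self, sentence):
        """Return log P(language) + sum of log P(n-gram | language) per language."""
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_label in self.class_docs.items():
            total = sum(self.counts[label].values())
            score = math.log(n_label / self.n_docs)
            for feat in ngrams(sentence):
                if feat not in self.vocab:
                    continue  # n-grams never seen in training are ignored
                numerator = self.counts[label][feat] + self.alpha
                denominator = total + self.alpha * vocab_size
                score += math.log(numerator / denominator)
            scores[label] = score
        return scores

    def predict(self, sentence):
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Hypothetical toy data, NOT taken from the paper's corpus.
train = [
    ("saya makan nasi di rumah", "indonesian"),
    ("kamu pergi ke pasar", "indonesian"),
    ("aku mangan sega ing omah", "javanese"),
    ("kowe lunga menyang pasar", "javanese"),
    ("abdi tuang sangu di bumi", "sundanese"),
    ("anjeun angkat ka pasar", "sundanese"),
]
model = MultinomialNB().fit([s for s, _ in train], [y for _, y in train])

for text in ["saya pergi ke rumah", "aku lunga menyang omah", "abdi angkat ka pasar"]:
    print(text, "->", model.predict(text))
```

Changing `n_values` in `ngrams` (for example to `(1,)` or `(2,)`) corresponds loosely to the first test scenario, and varying the number of training sentences per language corresponds to the second.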

Copyrights © 2024






Journal Info

Abbrev

engineering

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Engineering; Industrial & Manufacturing Engineering

Description

A publication medium for the scholarly work of Universitas Telkom graduates, covering engineering studies. Uploaded papers go through a review procedure (reviewer) and supervisor approval ...