Identifikasi Bahasa Daerah di Indonesia Dengan Multinomial Naïve Bayes (Identification of Regional Languages in Indonesia with Multinomial Naïve Bayes)
Ardhianda, Maulana Muhammad; Ardiyanti, Suryani Arie
eProceedings of Engineering Vol. 11 No. 4 (2024): August 2024
Publisher: eProceedings of Engineering

Abstract

Language identification has been studied extensively, but few results are available for the regional languages of Indonesia. This research therefore addresses the identification of local languages in Indonesia across seven languages: Indonesian, Javanese, Sundanese, Minang, Muna, Bugis, and Madurese. Languages are identified with the Multinomial Naïve Bayes method, which estimates the probability of each word pattern (word sequence) appearing in a labeled sentence. The resulting probability model is then used to assign a language class to new sentences. The performance of the method is measured in two test scenarios. The first examines the effect of the n-gram pattern on the F-measure, and the second examines the effect of the amount of training data on the F-measure. The results show that the unigram and bigram patterns give the highest accuracy, 98.86%. With 1,500 training sentences per language, the method reaches an accuracy of 98%.

Keywords: language identification, local languages, multinomial naïve bayes
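
To make the approach in the abstract concrete, the following is a minimal sketch of a Multinomial Naïve Bayes classifier over word unigrams and bigrams. It is not the authors' implementation. Whitespace tokenization, Laplace smoothing (`alpha=1.0`), skipping n-grams never seen in training, and the tiny three-language training set are all assumptions made for illustration; the names `word_ngrams` and `MultinomialNB` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(sentence, n_values=(1, 2)):
    """Return the word n-grams of a sentence as space-joined strings."""
    tokens = sentence.lower().split()  # assumption: whitespace tokenization
    grams = []
    for n in n_values:
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class MultinomialNB:
    """Illustrative Multinomial Naive Bayes with Laplace smoothing (assumed)."""

    def __init__(self, alpha=1.0, n_values=(1, 2)):
        self.alpha = alpha
        self.n_values = n_values

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)          # sentences per language
        self.feature_counts = defaultdict(Counter)   # n-gram counts per language
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            grams = word_ngrams(sentence, self.n_values)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        self.totals = {c: sum(self.feature_counts[c].values())
                       for c in self.class_counts}
        return self

    def log_scores(self, sentence):
        """Log prior plus summed log likelihoods for each language."""
        n_sentences = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.class_counts:
            score = math.log(self.class_counts[c] / n_sentences)
            denom = self.totals[c] + self.alpha * vocab_size
            for gram in word_ngrams(sentence, self.n_values):
                if gram in self.vocab:  # assumption: ignore unseen n-grams
                    count = self.feature_counts[c][gram]
                    score += math.log((count + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, sentence):
        scores = self.log_scores(sentence)
        return max(scores, key=scores.get)


# Toy training data, invented for this sketch (not the paper's corpus).
train_sentences = [
    "saya makan nasi", "saya pergi ke pasar",
    "aku mangan sega", "aku lunga menyang pasar",
    "abdi tuang sangu", "abdi angkat ka pasar",
]
train_labels = [
    "indonesian", "indonesian",
    "javanese", "javanese",
    "sundanese", "sundanese",
]

model = MultinomialNB(alpha=1.0, n_values=(1, 2)).fit(train_sentences, train_labels)
print(model.predict("saya pergi makan"))
print(model.predict("aku mangan"))
```

Changing `n_values` to `(1,)` or `(2,)` corresponds to the kind of n-gram comparison described in the first test scenario, and varying the number of training sentences per language corresponds to the second.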