Dialect Classification of the Javanese Language Using the K-Nearest Neighbor
Filby, Brilliant; Pujianto, Utomo; Hammad, Jehad A. H.; Wibawa, Aji Prasetya
Journal of Information Technology and Cyber Security Vol. 2 No. 2 (2024): July
Publisher : Department of Information Systems and Technology, Faculty of Intelligent Electrical and Informatics Technology, Universitas 17 Agustus 1945 Surabaya

DOI: 10.30996/jitcs.12213

Abstract

Indonesia is rich in ethnic and cultural diversity, and each group is reflected in its own linguistic characteristics. One way to help preserve the Javanese language is to study its dialects. This study classifies three main dialects on the island of Java (East Java, Central Java, and West Java) using text data from online sources. The classification process consists of preprocessing (tokenizing, case folding, and word weighting), data balancing with the Synthetic Minority Oversampling Technique (SMOTE), and classification with the K-Nearest Neighbor (K-NN) algorithm. The study highlights the role of dialect recognition in supporting both the preservation of Javanese and the development of linguistic technology applications. Under 10-fold cross-validation, the best-performing configuration reached an accuracy of 94.05%, a precision of 95.83%, and a recall of 94.44%. These findings support computational linguistics research and the preservation of regional languages.
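
The pipeline named in the abstract (case folding, tokenizing, word weighting, SMOTE balancing, K-NN) can be illustrated with a small standard-library sketch. It is not the authors' implementation. The abstract does not say which weighting scheme or distance was used, so TF-IDF and Euclidean distance are assumptions here, as are the function names (`tokenize`, `tfidf_vectors`, `smote`, `knn_predict`), the value `k=3`, and the placeholder documents and labels, which are invented toy data rather than Javanese text. The 10-fold cross-validation step is left out for brevity.

```python
import math
import random
from collections import Counter

def tokenize(text):
    """Case folding followed by whitespace tokenizing."""
    return text.lower().split()

def tfidf_vectors(docs):
    """Word weighting (assumed TF-IDF): one dense vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    df = Counter(w for toks in tokenized for w in set(toks))
    idf = {w: math.log(len(docs) / df[w]) + 1.0 for w in vocab}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        size = max(len(toks), 1)  # guard against empty documents
        vectors.append([tf[w] / size * idf[w] for w in vocab])
    return vectors

def distance(a, b):
    """Euclidean distance (an assumption; the abstract names no metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def smote(X, y, k=2, seed=0):
    """Balance classes by interpolating between a minority sample
    and one of its k nearest same-class neighbors."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # no neighbor to interpolate toward
        for _ in range(target - count):
            base = rng.choice(members)
            others = [m for m in members if m is not base]
            neighbors = sorted(others, key=lambda m: distance(base, m))[:k]
            other = rng.choice(neighbors)
            gap = rng.random()
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest training vectors."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: distance(p[0], x))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]

# Invented placeholder corpus: class "A" has 4 documents, class "B" only 2.
docs = [
    "Alpha beta gamma", "alpha Beta delta", "alpha gamma delta",
    "beta gamma delta", "omega sigma theta", "Omega sigma kappa",
]
labels = ["A", "A", "A", "A", "B", "B"]
query = "omega theta kappa"

# Vectorize the query together with the corpus so all vectors share one vocabulary.
vectors = tfidf_vectors(docs + [query])
X, q = vectors[:-1], vectors[-1]
X_bal, y_bal = smote(X, labels)
pred = knn_predict(X_bal, y_bal, q, k=3)
print(Counter(y_bal), pred)
```

In a real evaluation the weighting statistics and the oversampling would be fitted on the training folds only, so that synthetic samples and document frequencies never leak into the held-out fold.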