Journal of Information System
Vol 11 No 1 (2026): (December 2025 - May 2026)

Part-of-Speech Tagging of Javanese Using the Pre-Trained Bidirectional Encoder Representations from Transformers Model

Ahmad Izzuddin (Universitas Panca Marga)
Nuzul Hikmah (Universitas Panca Marga)
Muhammad Alvin Ajry (Panca Marga University)



Article Info

Publish Date
29 May 2026

Abstract

Part-of-speech (POS) tagging is the process of assigning a word class to each word in a text and is a fundamental task in natural language processing. For Javanese, POS tagging remains challenging because linguistic resources are limited and the language is complex. This study applies fine-tuning of BERT (Bidirectional Encoder Representations from Transformers) to classify word classes in Javanese, a low-resource language. The javanese-bert-small model was fine-tuned on the UD_Javanese-CSUI dataset and evaluated using precision, recall, F1-score, and accuracy. The model achieved an accuracy of 88.87% and trained stably, without significant overfitting. These findings indicate that the BERT-based approach is effective in handling word-class ambiguity in Javanese and can serve as a stepping stone for further development of NLP systems for regional languages.
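
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. The function name `pos_tagging_metrics`, the choice of macro averaging over tags, and the toy tag sequences are assumptions made for illustration only; the abstract does not state which averaging scheme the authors used, and the example tags are not taken from UD_Javanese-CSUI.

```python
from collections import Counter


def pos_tagging_metrics(gold, pred):
    """Token-level accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption here, not the paper's stated choice.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")

    # Count true positives, false positives, and false negatives per tag.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted tag p was wrong
            fn[g] += 1  # gold tag g was missed

    labels = sorted(set(gold) | set(pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(gold),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy example with Universal Dependencies style tags (illustrative only).
gold_tags = ["PRON", "VERB", "NOUN", "ADP", "NOUN"]
pred_tags = ["PRON", "VERB", "NOUN", "ADP", "VERB"]
metrics = pos_tagging_metrics(gold_tags, pred_tags)
print(metrics)
```

In this toy case one of five tokens is mislabeled, so accuracy is 0.8, while the macro-averaged scores weight every tag equally regardless of how often it occurs.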

Copyrights © 2026