Techno.Com: Jurnal Teknologi Informasi
Vol. 25 No. 2 (2026): May 2026

A Systematic Evaluation of BERT Classifiers for Indonesia-based Text Data

Yogie Oktavianus Sihombing (Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia)
Khusnul Muchlisin (Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia)
Tri Fidrian Arya (Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia)
Moh. Jabir Mubarok (Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia)
Reza Fuad Rachmadi (Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia)



Article Info

Publish Date
28 May 2026

Abstract

This study presents a systematic evaluation of Indonesian BERT models across multiple natural language processing (NLP) tasks, including named entity recognition (NER), sentiment analysis (SA), emotion classification (EmoT), and hate speech detection (HS). Unlike prior studies that focus primarily on effectiveness metrics, this work measures both effectiveness (F1-Macro and accuracy) and efficiency (training time and memory usage) to provide a more comprehensive benchmark. Experimental results show that IndoRoBERTa achieves the highest overall F1-Macro (0.826), indicating strong generalization across tasks, while IndoNLU attains the highest accuracy (0.833), suggesting better performance on dominant classes. IndoLEM is the most efficient, with the lowest training time (988.68 seconds) and the lowest GPU memory usage (4.00 GB), making it suitable for resource-constrained environments. In contrast, the multilingual mBERT model incurs a higher computational cost. The findings highlight a trade-off between performance and computational efficiency, and monolingual Indonesian models consistently outperform multilingual models in both effectiveness and resource utilization. These results provide practical guidance for selecting pretrained language models based on task requirements and computational constraints in Indonesian NLP applications.

Keywords: BERT; Indonesian NLP; model efficiency; multi-task evaluation
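
The abstract reads a higher accuracy as better performance on dominant classes and a higher F1-Macro as more balanced performance across classes. The following minimal sketch illustrates how the two metrics can rank models differently on imbalanced labels. The label names, the toy data, and the two hypothetical models are assumptions made purely for illustration; they are not taken from the study or its datasets.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never present scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy imbalanced data (illustrative only): 8 majority, 2 minority examples.
y_true = ["neutral"] * 8 + ["hate"] * 2

# Hypothetical model A always predicts the majority class.
model_a = ["neutral"] * 10

# Hypothetical model B finds every minority example but mislabels
# three majority examples.
model_b = ["neutral"] * 5 + ["hate"] * 3 + ["hate"] * 2

for name, y_pred in [("A", model_a), ("B", model_b)]:
    print(name, round(accuracy(y_true, y_pred), 3), round(macro_f1(y_true, y_pred), 3))
```

In this toy setting, model A has the higher accuracy while model B has the higher F1-Macro, which is the kind of divergence the abstract describes between IndoNLU and IndoRoBERTa.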

Copyrights © 2026






Journal Info

Abbrev

technoc

Publisher

Subject

Computer Science & IT Engineering

Description

The topics of the Techno.Com journal are as follows (but are not limited to these): Digital Signal Processing, Human Computer Interaction, IT Governance, Networking Technology, Optical Communication Technology, New Media Technology, Information Search Engine, Multimedia, Computer Vision, ...