MATRIK : Jurnal Manajemen, Teknik Informatika, dan Rekayasa Komputer
Vol. 25 No. 2 (2026)

Comparative Analysis of Indonesian Pre-trained BERT Models for the Extractive Question Answering Task on an Indonesian-Translated SQuAD Dataset

Suhendra, Fattah Al Ilmi
Darmayantie, Astie
Suhendra, Adang Suhendra
Pa Pa Min



Article Info

Publish Date
11 Mar 2026

Abstract

Transformer-based architectures have significantly advanced Natural Language Processing (NLP), with Bidirectional Encoder Representations from Transformers (BERT) serving as a strong baseline for extractive Question Answering (QA). This study evaluates the performance of Indonesian BERT models on extractive QA and identifies the more effective model for low-resource language settings. A comparative experimental method was applied to two Indonesian BERT variants: indobert-base-uncased (IndoLEM) and indobert-base-p1 (IndoNLU/IndoBenchmark). Both models were fine-tuned on an Indonesian version of SQuAD 2.0 that was automatically translated via the Google Translate API. Answer-span alignment errors introduced by translation were corrected using fuzzy string matching. The two models were evaluated under identical hyperparameter settings and training schemes, with Exact Match (EM) and F1-score as performance metrics. The results indicate that IndoLEM performed better, with better loss convergence and a higher F1-score (71.58) than IndoNLU (63.59); the difference was statistically significant (p < 0.001). In conclusion, IndoLEM is the more effective baseline model for Indonesian extractive QA systems. The findings also demonstrate that the composition and scale of the pre-training corpora substantially influence model performance in low-resource language contexts, and they highlight the importance of transfer learning for advancing NLP in underrepresented languages.
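The two technical steps named in the abstract, fuzzy re-alignment of translated answer spans and EM/F1 scoring, can be illustrated with a minimal sketch. The abstract does not state which matching library, similarity threshold, or text normalization the authors used, so everything below is an assumption: the helper names (`align_answer`, `normalize`, `exact_match`, `f1_score`), the 0.6 threshold, and the use of Python's standard-library `difflib` as a stand-in for the fuzzy matcher are all hypothetical. The scoring follows the usual SQuAD-style token overlap, minus the English article removal, which does not apply to Indonesian.

```python
from collections import Counter
from difflib import SequenceMatcher


def align_answer(context, answer, threshold=0.6):
    """Find the context span most similar to a translated answer.

    Returns (start_char, matched_text), or None if no window reaches
    the (assumed) similarity threshold.
    """
    exact = context.find(answer)
    if exact != -1:
        return exact, answer  # translation preserved the span verbatim

    # Character offsets of each whitespace-separated token in the context.
    offsets, pos = [], 0
    for tok in context.split():
        start = context.index(tok, pos)
        offsets.append((start, start + len(tok)))
        pos = start + len(tok)

    n = max(1, len(answer.split()))
    best_score, best_span = 0.0, None
    # Try windows one token shorter, equal to, and one longer than the answer.
    for size in (n - 1, n, n + 1):
        if size < 1:
            continue
        for i in range(len(offsets) - size + 1):
            s, e = offsets[i][0], offsets[i + size - 1][1]
            score = SequenceMatcher(None, context[s:e].lower(), answer.lower()).ratio()
            if score > best_score:
                best_score, best_span = score, (s, context[s:e])
    return best_span if best_score >= threshold else None


def normalize(text):
    """Lowercase, drop punctuation, and split into tokens."""
    return "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace()).split()


def exact_match(pred, gold):
    return float(normalize(pred) == normalize(gold))


def f1_score(pred, gold):
    p, g = normalize(pred), normalize(gold)
    if not p or not g:
        # SQuAD 2.0 unanswerable case: credit only if both are empty.
        return float(p == g)
    overlap = sum((Counter(p) & Counter(g)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)
```

For instance, with the context "Ibu kota Indonesia adalah Jakarta yang terletak di pulau Jawa" and a mistranslated answer "Djakarta", `align_answer` would select the span "Jakarta"; a prediction of "di Jakarta" against the gold answer "Jakarta" scores EM 0 and F1 of about 0.67.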

Copyrights © 2026






Journal Info

Abbrev

matrik

Publisher

Subject

Computer Science & IT

Description

MATRIK is one of the scientific journals of Universitas Bumigora Mataram (formerly STMIK Bumigora Mataram), managed under the Institute for Research and Community Service (Lembaga Penelitian dan Pengabdian kepada Masyarakat, LPPM). The journal aims to provide a publication venue for lecturers, researchers and ...