Selena Nurmanina Afandy
Universitas Pembangunan Nasional "Veteran" Jawa Timur


Comparison of the Effectiveness IndoBERT and mBERT for Sentiment Analysis of SME Customer Reviews
Authors: Selena Nurmanina Afandy; Kartika Maulida Hindrayani; Aviolla Terza Damaliana
bit-Tech Vol. 8 No. 3 (2026): bit-Tech - IN PROGRESS
Publisher : Komunitas Dosen Indonesia

DOI: 10.32877/bt.v8i3.3501

Abstract

This study presents a structured comparative evaluation of IndoBERT and Multilingual BERT (mBERT) for three-class sentiment classification of customer reviews from Pawonkoe Banyuwangi, an Indonesian small and medium-sized enterprise (SME). Motivated by the limited transferability of IndoNLU-style benchmarks to real SME feedback, the central question is whether monolingual and multilingual transformers remain reliable when fine-tuned on small, domain-specific, and operationally noisy datasets. The empirical basis is 365 survey-based reviews collected from January to December 2024, a corpus substantially smaller than typical transformer fine-tuning corpora. Models were fine-tuned under matched hyperparameters and evaluated using a single stratified hold-out train–test split (not cross-validation), reporting accuracy, precision, recall, and F1-score. To reflect the deployed pipeline, mBERT additionally incorporates the original 1–5 rating as an auxiliary numeric signal alongside the review text, whereas IndoBERT is trained on text only. The results reveal a substantial performance gap: mBERT achieved 81% test accuracy, whereas IndoBERT reached 48% under the same evaluation setting. Because the label distribution is strongly imbalanced (with very few negative instances), these aggregate scores should be read as measures of overall effectiveness, not of minority-class robustness. Overall, the findings indicate that multilingual representations combined with auxiliary rating information can generalize more effectively in low-resource SME scenarios, while IndoBERT appears more sensitive to data scarcity in this context. The study offers practical guidance for model selection in resource-constrained Indonesian sentiment analytics and contributes evidence on transformer behavior beyond curated benchmarks.
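
The evaluation protocol described in the abstract (a single stratified hold-out split, scored with accuracy, precision, recall, and F1) can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline and trains no model. The class counts below are hypothetical (the abstract gives only the total of 365 and says negatives are rare), and the helper names `stratified_holdout` and `per_class_report` are invented for illustration. The sketch scores a trivial majority-class baseline to show why accuracy alone says little about the minority class.

```python
import random
from collections import Counter, defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same train/test ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for y in sorted(by_class):
        idx = by_class[y][:]
        rng.shuffle(idx)
        # Keep at least one test example for any class with two or more items.
        n_test = max(1, round(len(idx) * test_fraction)) if len(idx) > 1 else 0
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def per_class_report(y_true, y_pred):
    """Return accuracy, macro F1, and per-class precision/recall/F1.

    Undefined ratios (zero denominator) are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    macro_f1 = sum(r["f1"] for r in report.values()) / len(report)
    return accuracy, macro_f1, report


# Hypothetical class counts summing to 365; the real distribution is not
# stated in the abstract.
labels = ["positive"] * 290 + ["neutral"] * 60 + ["negative"] * 15
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2, seed=42)
y_test = [labels[i] for i in test_idx]

# Trivial baseline: always predict the majority class.
y_pred = ["positive"] * len(y_test)
accuracy, macro_f1, report = per_class_report(y_test, y_pred)

print("test class counts:", dict(Counter(y_test)))
print(f"accuracy={accuracy:.3f}  macro_f1={macro_f1:.3f}")
print("negative-class recall:", report["negative"]["recall"])
```

Under these assumed counts, the baseline reaches high accuracy while its recall on the negative class is zero. That is the situation the abstract warns about when it asks readers to treat aggregate scores as overall effectiveness only. With only a handful of negative test reviews, per-class figures from a single split are also unstable.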