Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Addressing Extreme Class Imbalance in Multilingual Complaint Classification Using XLM-RoBERTa
Authors : Ariyanto, Muhammad; Alzami, Farrikh; Sani, Ramadhan Rakhmat; Gamayanto, Indra; Naufal, Muhammad; Winarno, Sri; Iswahyudi
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.11606

Abstract

Government complaint management systems often suffer from extreme class imbalance, where a few public service categories accumulate most reports while many others remain under-represented. This research examines whether simple class weighting can improve fairness in multilingual transformer models for automatic routing of Indonesian citizen complaints on the LaporGub Central Java e-governance platform. The dataset comprises 53,877 Indonesian-language complaints spanning 18 service categories with an imbalance ratio of about 227:1 between the largest and smallest classes. After cleaning and deduplication, we stratify the data into training, validation, and test sets. We compare three approaches: (i) a linear support vector machine (SVM) with term frequency-inverse document frequency (TF-IDF) unigram and bigram features and class-balanced weights, (ii) a cross-lingual RoBERTa (XLM-RoBERTa-base) model without class weighting, and (iii) an XLM-RoBERTa-base model with a class-weighted cross-entropy loss. Fairness is operationalised as equal importance for all categories and quantified primarily using the macro-averaged F1-score (Macro-F1), complemented by per-class F1, weighted F1, and accuracy. The unweighted XLM-RoBERTa model outperforms the SVM baseline in Macro-F1 (0.610 vs 0.561). The class-weighted variant attains a similar Macro-F1 (0.608) while redistributing performance towards minority categories. Analysis shows that class weighting is most beneficial for categories with a few hundred to several thousand samples, whereas extremely rare categories with fewer than 200 complaints remain difficult for all models and require additional data-centric interventions. These findings demonstrate that multilingual transformer architectures combined with simple class weighting can provide a more balanced backbone for automated complaint routing in Indonesian e-government, particularly for low- and medium-frequency service categories.
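To make the two central ideas of the abstract concrete, the following is a minimal pure-Python sketch of a class-weighted cross-entropy loss and of Macro-F1 on a toy two-class problem. It is not the authors' implementation. The abstract does not state how the class weights were derived, so the inverse-frequency "balanced" heuristic used here (n_samples / (n_classes * class_count)) is an assumption, as are the function names, the category labels "roads" and "health", and the toy probabilities.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class).

    probs is a list of dicts mapping class -> predicted probability.
    The sum is normalised by the total weight of the samples.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: 8 majority-class and 2 minority-class complaints.
y_true = ["roads"] * 8 + ["health"] * 2
weights = balanced_class_weights(y_true)
uniform = {"roads": 1.0, "health": 1.0}

# A model that always leans towards the majority class.
probs = [{"roads": 0.9, "health": 0.1}] * len(y_true)
y_pred = ["roads"] * len(y_true)

plain_loss = weighted_cross_entropy(probs, y_true, uniform)
weighted_loss = weighted_cross_entropy(probs, y_true, weights)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
```

In this toy case the majority-only predictor reaches an accuracy of 0.8 but a Macro-F1 of only 4/9, because the minority class contributes an F1 of zero with the same importance as the majority class. The weighted loss is also larger than the unweighted one, since errors on the rare class are scaled up, which is the mechanism by which class weighting pushes a model towards minority categories.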