Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Medical Named Entity Recognition from Indonesian Health-News using BiLSTM-CRF with Static and Contextual Embeddings
Ignasius, Darnell; Novita Dewi, Ika; Bernadette Chayeenee Norman, Maria; Rakhmat Sani, Ramadhan
Journal of Applied Informatics and Computing Vol. 9 No. 6 (2025): December 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i6.11574

Abstract

Named Entity Recognition (NER) is vital for structuring medical texts by identifying entities such as diseases, symptoms, and drugs. However, research on Indonesian medical NER remains limited due to the lack of annotated corpora and linguistic resources. This scarcity makes it difficult to learn meaningful word representations, which are crucial for accurate entity identification. This research compares the effectiveness of static and contextual embeddings for entity recognition in Indonesian biomedical text. The experimental setup combined static (Word2Vec) and contextual (IndoBERT) embeddings with a BiLSTM neural architecture, with and without a Conditional Random Field (CRF) decoding layer. The BiLSTM architecture was selected for its ability to capture bidirectional dependencies in language sequences. Specifically, four models (Word2Vec-BiLSTM, Word2Vec-BiLSTM-CRF, IndoBERT-BiLSTM, and IndoBERT-BiLSTM-CRF) were evaluated to assess the impact of contextual representations and structured decoding. The models were trained on a manually annotated DetikHealth corpus, in which medical entities such as diseases, symptoms, and drugs were labeled with the BIO tagging scheme. Performance was evaluated using standard metrics: precision, recall, and F1-score. Results indicate that IndoBERT’s contextual embeddings significantly outperform static Word2Vec features. The IndoBERT-BiLSTM-CRF model achieved the highest performance, with a micro-F1 of 0.4330 and a macro-F1 of 0.3297, and with the Disease entity reaching an F1-score of 0.5882. Combining contextual embeddings with CRF-based decoding enhances semantic understanding and boundary consistency, giving the best performance among the evaluated models for Indonesian biomedical NER. Future work should explore domain-adaptive pretraining and larger biomedical corpora to further improve contextual accuracy.
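The evaluation described in this abstract (BIO tags scored with precision, recall, and F1) can be illustrated with a small sketch. It assumes entity-level exact-match scoring, which the abstract does not confirm; the function names and the toy tag sequences are hypothetical, and none of the numbers below come from the paper.

```python
from collections import Counter
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one BIO-tagged sentence into a set of entity spans."""
    spans: Set[Span] = set()
    start, etype = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((etype, start, i))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def f1(tp: int, fp: int, fn: int) -> float:
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float]:
    """Entity-level micro-F1 (pooled counts) and macro-F1 (mean of per-type F1)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g_tags, p_tags in zip(gold, pred):
        g, p = bio_to_spans(g_tags), bio_to_spans(p_tags)
        for etype, _, _ in g & p:
            tp[etype] += 1
        for etype, _, _ in p - g:
            fp[etype] += 1
        for etype, _, _ in g - p:
            fn[etype] += 1
    types = sorted(set(tp) | set(fp) | set(fn))
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[t], fp[t], fn[t]) for t in types) / len(types) if types else 0.0
    return micro, macro


# Toy data (hypothetical): Drug is never predicted correctly.
gold = [["B-Disease", "I-Disease", "O", "B-Drug"], ["B-Symptom", "O"], ["B-Disease"]]
pred = [["B-Disease", "I-Disease", "O", "O"], ["B-Symptom", "B-Drug"], ["B-Disease"]]
micro, macro = micro_macro_f1(gold, pred)
```

In this toy case micro-F1 is 0.75 while macro-F1 is about 0.67, because the entity type that is always missed pulls the unweighted per-type average down. The same mechanism can make a macro-F1 lower than the corresponding micro-F1 when some entity types are rare or hard.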

Addressing Extreme Class Imbalance in Multilingual Complaint Classification Using XLM-RoBERTa
Ariyanto, Muhammad; Alzami, Farrikh; Sani, Ramadhan Rakhmat; Gamayanto, Indra; Naufal, Muhammad; Winarno, Sri; Iswahyudi
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.11606

Abstract

Government complaint management systems often suffer from extreme class imbalance, where a few public service categories accumulate most reports while many others remain under-represented. This research examines whether simple class weighting can improve fairness in multilingual transformer models for automatic routing of Indonesian citizen complaints on the LaporGub Central Java e-governance platform. The dataset comprises 53,877 Indonesian-language complaints spanning 18 service categories, with an imbalance ratio of about 227:1 between the largest and smallest classes. After cleaning and deduplication, we stratify the data into training, validation, and test sets. We compare three approaches: (i) a linear support vector machine (SVM) with term frequency-inverse document frequency (TF-IDF) unigram and bigram features and class-balanced weights, (ii) a cross-lingual RoBERTa (XLM-RoBERTa-base) model without class weighting, and (iii) an XLM-RoBERTa-base model with a class-weighted cross-entropy loss. Fairness is operationalised as equal importance for all categories and quantified primarily using the macro-averaged F1-score (Macro-F1), complemented by per-class F1, weighted F1, and accuracy. The unweighted XLM-RoBERTa model outperforms the SVM baseline in Macro-F1 (0.610 vs 0.561). The class-weighted variant attains a similar Macro-F1 (0.608) while redistributing performance towards minority categories. Analysis shows that class weighting is most beneficial for categories with a few hundred to several thousand samples, whereas extremely rare categories with fewer than 200 complaints remain difficult for all models and require additional data-centric interventions. These findings suggest that multilingual transformer architectures combined with simple class weighting can provide a more balanced backbone for automated complaint routing in Indonesian e-government, particularly for low- and medium-frequency service categories.
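The idea of a class-weighted cross-entropy loss mentioned in this abstract can be sketched in a few lines. The abstract does not give the weighting formula, so this sketch assumes the common inverse-frequency heuristic w_c = N / (K · n_c), where N is the number of samples, K the number of classes, and n_c the size of class c. The category names, counts, and function names are hypothetical; only the 227:1 ratio echoes the abstract.

```python
import math
from typing import Dict, List


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Inverse-frequency weights: w_c = N / (K * n_c) (assumed heuristic)."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}


def weighted_cross_entropy(
    probs: List[Dict[str, float]],  # predicted class probabilities per sample
    labels: List[str],              # true label per sample
    weights: Dict[str, float],      # per-class weights
) -> float:
    """Weighted mean of -log p(true class), normalised by the sum of weights."""
    num = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    den = sum(weights[y] for y in labels)
    return num / den


# Hypothetical counts with a 227:1 ratio between largest and smallest class.
counts = {"roads": 2270, "health": 500, "archives": 10}
weights = balanced_class_weights(counts)

# Two samples: a confident correct majority prediction and a weak minority one.
probs = [{"roads": 0.9, "archives": 0.1}, {"roads": 0.6, "archives": 0.4}]
labels = ["roads", "archives"]
uniform = {label: 1.0 for label in counts}
plain_loss = weighted_cross_entropy(probs, labels, uniform)
weighted_loss = weighted_cross_entropy(probs, labels, weights)
```

With these weights the rarest class counts 227 times as much per sample as the largest one, so the weak prediction on the minority sample dominates the weighted loss. Weighting of this kind changes what the model is penalised for, but it cannot add information about classes with very few examples, which is consistent with the abstract's remark about categories with fewer than 200 complaints.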

Exploring Public Opinion on the 'Makan Bergizi Gratis' Program on X: A Comparative Analysis of IndoBERT-Large and NusaBERT-Large Models
Arunia, Aurelya Prameswari; Sani, Ramadhan Rakhmat; Dewi, Ika Novita; Sulistyono, MY Teguh
Journal of Applied Informatics and Computing Vol. 10 No. 1 (2026): February 2026
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v10i1.11757

Abstract

The Makan Bergizi Gratis (MBG) program has triggered extensive discourse on the social media platform X, which serves as a primary space for public expression of opinions toward government policies. This study analyzes public sentiment toward the MBG program while comparing the performance of two prominent Transformer-based models, IndoBERT-Large and NusaBERT-Large. The research adopts a quantitative approach, applying supervised learning to 10,201 Indonesian-language posts (tweets) collected through web scraping from February 2024 to September 2025. A total of 2,000 samples were manually annotated as ground truth, achieving a high level of inter-annotator reliability (Cohen’s Kappa, κ = 0.81). The experimental results indicate that IndoBERT-Large outperforms NusaBERT-Large, achieving an accuracy of 83.00%, while NusaBERT-Large demonstrates competitive performance with an accuracy of 80.50%. Substantively, public discourse is dominated by negative sentiment, which accounts for nearly 50% of the total data and reflects public concerns regarding budgetary constraints and technical implementation issues. Positive sentiment ranges between 33% and 36%, indicating sustained and substantial public support for the program. These findings support the effectiveness of Transformer-based models in capturing the dynamics of public opinion toward government policies using social media data.
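The inter-annotator reliability figure reported in this abstract is Cohen's kappa, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement between two annotators and p_e the agreement expected by chance from their label frequencies. A minimal sketch follows; the function name and the toy annotations are hypothetical and unrelated to the study's data.

```python
from collections import Counter
from typing import List


def cohens_kappa(rater_a: List[str], rater_b: List[str]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:  # both raters used a single identical label throughout
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Toy sentiment annotations (hypothetical): the raters disagree on one item.
annotator_1 = ["pos", "neg", "neg", "neu", "pos", "neg"]
annotator_2 = ["pos", "neg", "neu", "neu", "pos", "neg"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Here the raters agree on 5 of 6 items (p_o ≈ 0.83) and chance agreement is 1/3, giving κ = 0.75. Kappa is lower than raw agreement because it discounts the matches two annotators would produce by chance alone.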