Journal of Computer Science and Informatics Engineering
Vol 4 No 3 (2025): July

Comparative Performance of IndoBERT and IndoLEM Baseline Models for Post-Disaster Health Information Extraction from Indonesian Online News

Istiqomah, Nalar
Novika, Fanny



Article Info

Publish Date
09 Jul 2025

Abstract

Natural disasters often have significant impacts on public health, yet systematic monitoring of post-disaster diseases in Indonesia remains limited. This study compares two Named Entity Recognition (NER) approaches for extracting health impacts, affected locations, and disaster types from Indonesian-language online news articles. The first is IndoBERT, fine-tuned on 1,137 manually validated disaster-related news articles. The second comprises baseline models from the IndoLEM benchmark, namely mBERT and XLM-RoBERTa (XLM-R), used without domain-specific training. Evaluation results show that IndoBERT outperforms the baseline models, achieving 90.00% accuracy and an F1-score of 88.26%, compared to mBERT (72.93%) and XLM-R (76.44%). Further analysis of the extracted entities reveals spatial and temporal disease trends: floods in Java are consistently associated with diarrhea and skin diseases, while volcanic eruptions in eastern Indonesia are linked to respiratory infections and hypertension. These findings highlight the importance of selecting appropriate models to support data-driven public health monitoring systems in disaster-prone regions.
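
To make the reported F1-score concrete, the following is a minimal sketch of how an entity-level F1 could be computed for BIO-tagged NER output. The abstract does not state whether the authors scored at token level or entity level, so exact-match entity scoring is an assumption here. The label names (`DISASTER`, `LOC`, `DISEASE`), the helper names `bio_to_spans` and `entity_f1`, and the example sentence are all hypothetical and not taken from the paper.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); label names are hypothetical
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert one BIO tag sequence into a set of typed entity spans.

    Assumption: an I- tag that does not continue an open entity of the
    same type is ignored rather than treated as a new entity.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((label, start, i))
            start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans


def entity_f1(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over exact-match entity spans."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = bio_to_spans(g), bio_to_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example sentence: "Banjir di Jawa Tengah memicu diare"
gold = [["B-DISASTER", "O", "B-LOC", "I-LOC", "O", "B-DISEASE"]]
# The prediction truncates the location span, so it does not count as a match.
pred = [["B-DISASTER", "O", "B-LOC", "O", "O", "B-DISEASE"]]

precision, recall, f1 = entity_f1(gold, pred)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Under exact-match scoring, a partially recovered entity such as the truncated location above counts as both a false positive and a false negative, which is why entity-level F1 is typically lower than token-level accuracy on the same predictions.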

Copyrights © 2025






Journal Info

Abbrev

cosie

Publisher

Subject

Computer Science & IT

Description

Artificial Intelligence, Machine Learning, Natural Language Processing, Computer Vision, Text, Speech, Text Mining, Data Mining, Cryptography, Data Visualization, Expert System, Deep Learning, Fuzzy Logic, IoT and Smart Environments, Neural Networks, Pattern Recognition, Image Processing, Optimization, Digital Signal ...