Irwan Budiman
Department of Computer Science, Lambung Mangkurat University, Banjarbaru, Indonesia

Articles


Classification of Eyewitness Social Media Messages for Natural Disaster Monitoring using BERT Variants
Muhammad Bashir Hanafi; Mohammad Reza Faisal; Friska Abadi; Irwan Budiman; Setyo Wahyu Saputro; Njideka Nkemdilim Mbeledogu
Jurnal Teknik Informatika (Jutif) Vol. 7 No. 3 (2026): JUTIF Volume 7, Number 3, June 2026
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2026.7.3.5317

Abstract

The rapid growth of disaster-related social media data demands effective monitoring. However, the real-time nature of this data presents challenges, because it arrives in large volumes and is unstructured and noisy. This study aims to improve monitoring by using BERT variants to classify eyewitness reports on Twitter/X. Earlier studies have applied machine-learning and deep-learning models to automate the monitoring of eyewitness messages on social media, but these models still have shortcomings. Traditional machine-learning models rely on handcrafted and frequency-based features, limiting their ability to capture contextual semantics. Deep-learning models offer improved performance but still face challenges in modeling long-range dependencies and handling high-volume social media streams. This study employs five transformer-based models: BERT, RoBERTa, DistilBERT, ELECTRA, and ALBERT. Each model is pre-trained with the Masked Language Modeling (MLM) objective, and batch-size optimization is applied to boost performance. Experimental results indicate that a batch size of 16 consistently yields the best performance, with the standard BERT model achieving the highest macro-F1 score of 0.762. By disaster type, macro-F1 scores reach 0.744 for hurricane, 0.793 for flood, 0.756 for earthquake, and 0.750 for wildfire. BERT with a batch size of 16 outperforms the other BERT variants and twelve baseline models from prior research. Unlike previous approaches, this study leverages pre-trained Masked Language Models to optimize classification on disaster-related datasets. The findings contribute to the development of transformer-based architectures for text classification in real-time disaster informatics, supporting more accurate situational awareness and reduced delays in emergency decision-making.
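
The abstract reports results as macro-F1, which averages the per-class F1 scores with equal weight so that rare classes count as much as common ones. The sketch below shows how that metric is computed in plain Python. The function name `macro_f1` and the label names in the usage note are illustrative assumptions; the abstract does not state the label set or the evaluation code the authors used.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

For example, calling `macro_f1(["eyewitness", "eyewitness", "other", "other"], ["eyewitness", "other", "other", "other"])` averages an F1 of 2/3 for the first (assumed) label and 0.8 for the second.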