JITK (Jurnal Ilmu Pengetahuan dan Komputer)
Vol. 11 No. 3 (2026): JITK Issue February 2026

EVALUATING CLUSTERING METHODS FOR SEMANTIC REPRESENTATION OF DISASTER NEWS USING BERT EMBEDDINGS AND HDBSCAN

Ningrum, Ariska Fitriyana
Purwanto, Dannu
Sharkawy, Abdel Nasser



Article Info

Publish Date
11 Feb 2026

Abstract

Natural disasters occur frequently in Indonesia and demand fast, accurate monitoring and analysis of information from online news sources. This study aims to identify topic patterns related to natural disasters in Indonesia by applying semantic clustering to news articles from Detik.com. A total of 1,000 articles were collected, preprocessed, and encoded with the Sentence-BERT (SBERT) model to capture sentence-level contextual meaning. The resulting vectors were then clustered using three methods: K-Means, Agglomerative Hierarchical Clustering, and HDBSCAN. Each method was evaluated using the Silhouette Score (higher is better), the Davies–Bouldin (DB) Index (lower is better), and the Calinski–Harabasz (CH) Index (higher is better). HDBSCAN achieved the best Silhouette Score (0.215) and DB Index (1.557), ahead of Agglomerative (0.028 and 3.945) and K-Means (0.055 and 3.678). Its CH Index of 18.102 was lower than that of Agglomerative (29.669) and K-Means (36.778), so HDBSCAN leads on two of the three indices. HDBSCAN also achieved the highest coherence score, 0.8669, indicating strong semantic consistency within clusters. Five coherent clusters emerged, representing major disaster themes: landslides, earthquakes, tornadoes, flash floods, and volcanic activity. Word clouds generated for each cluster supported this interpretation of the disaster topics. Overall, the combination of SBERT and HDBSCAN groups news articles effectively by semantic similarity. These findings highlight the potential of Natural Language Processing (NLP) to enhance data-driven media monitoring, support early warning systems, and strengthen disaster communication and mitigation strategies in Indonesia.

Copyrights © 2026






Journal Info

Abbrev

jitk

Subject

Computer Science & IT
