Building of Informatics, Technology and Science
Vol 7 No 2 (2025): September 2025

Narratives of the Indonesian President: Political Discourse Analysis Using BERTopic to Uncover Thematic Patterns in Presidential Speeches

Uliyatunisa, Uliyatunisa (Unknown)
Tukiyat, Tukiyat (Unknown)
Waskita, Arya Adhyaksa (Unknown)
Handayani, Murni (Unknown)
Zain, Rafi Mahmud (Unknown)



Article Info

Publish Date
21 Sep 2025

Abstract

The speeches of the President of Indonesia are an important channel for political communication, policy delivery, and public leadership image building. However, the growing volume of speeches makes manual analysis difficult, because it is time-consuming and prone to researcher subjectivity. This study addresses the problem with BERTopic, a transformer-based topic modelling method that uses semantic representations from modern embedding models. The research data consists of transcripts of President Joko Widodo's official speeches obtained from the Cabinet Secretariat portal. To improve the quality of the semantic representations, the study compares several Indonesian-language embedding models, namely DistilBERT, NusaBERT, IndoE5, and SBERT. The analysis proceeds through data preprocessing, embedding generation, dimensionality reduction, clustering, and model evaluation using topic coherence metrics. The objectives are to reveal the themes contained in the President's speeches and to evaluate how effectively each embedding model produces coherent topics. The results reveal twenty recurring main themes, including infrastructure development, economic policy, health and the pandemic, digital transformation, international diplomacy, sports, nationalism, and regional development. In terms of performance, SBERT gives the best results, with coherence scores of UMass = -2.036 and NPMI = 0.082, indicating a positive semantic relationship. A UMass value closer to zero indicates greater coherence among the words within a topic, while an NPMI value above zero indicates that the connections between words are more easily understood by humans.
This research contributes to the development of NLP-based political discourse studies in Indonesia, providing an empirical overview of the selection of appropriate embedding models in topic modelling and opening up opportunities for the integration of similar methods in public policy analysis.
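
The two coherence metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: it assumes document-level co-occurrence counts (many toolkits use sliding windows for NPMI instead), averages over word pairs, and uses a hypothetical toy corpus and hypothetical function names.

```python
import math
from itertools import combinations


def doc_frequencies(docs, words):
    """Count the documents containing each word and each word pair."""
    doc_sets = [set(d) for d in docs]
    single = {w: sum(w in s for s in doc_sets) for w in words}
    pair = {
        (a, b): sum(a in s and b in s for s in doc_sets)
        for a, b in combinations(words, 2)
    }
    return single, pair


def umass(docs, words, eps=1.0):
    """UMass coherence, averaged over pairs (assumption: mean, not sum).

    `words` is expected in descending topic rank, so for each pair (a, b)
    the earlier word `a` is the conditioning word in the denominator.
    """
    single, pair = doc_frequencies(docs, words)
    scores = [
        math.log((pair[(a, b)] + eps) / single[a])
        for a, b in combinations(words, 2)
        if single[a] > 0
    ]
    return sum(scores) / len(scores) if scores else 0.0


def npmi(docs, words):
    """Mean normalised pointwise mutual information over word pairs."""
    n = len(docs)
    single, pair = doc_frequencies(docs, words)
    scores = []
    for a, b in combinations(words, 2):
        p_ab = pair[(a, b)] / n
        if p_ab == 0:
            scores.append(-1.0)  # words never co-occur
        elif p_ab == 1:
            scores.append(1.0)   # limiting value for total co-occurrence
        else:
            pmi = math.log(p_ab / ((single[a] / n) * (single[b] / n)))
            scores.append(pmi / -math.log(p_ab))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy corpus of tokenised documents (not from the study).
toy_docs = [
    ["jalan", "tol", "pembangunan"],
    ["jalan", "tol"],
    ["vaksin", "pandemi"],
    ["vaksin", "pandemi", "kesehatan"],
]
coherent_topic = ["jalan", "tol"]
mixed_topic = ["jalan", "vaksin"]
print(npmi(toy_docs, coherent_topic), npmi(toy_docs, mixed_topic))
print(umass(toy_docs, coherent_topic), umass(toy_docs, mixed_topic))
```

On this toy corpus the topic whose words always appear together scores higher on both metrics than the topic whose words never co-occur, which matches the reading given in the abstract: higher NPMI and a UMass value nearer to zero both point to a more coherent topic.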

Copyrights © 2025






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles reporting research results in information technology and computing. Papers submitted to this journal are checked for plagiarism and peer-reviewed to maintain quality. ...