Analysis and Evaluation of Qur’an Translation Topics Using Classical, Neural, and Transformer-Based Topic Modelling
Kurnia, Akhmad Rinaldy; Anggai, Sajarwo; Handayani, Murni
Jurnal Ilmiah Multidisiplin Indonesia (JIM-ID) Vol. 5 No. 02 (2026): Jurnal Ilmiah Multidisplin Indonesia (JIM-ID), February 2026
Publisher : Sean Institute

Abstract

Topic modelling is an important approach for extracting latent thematic structures from text corpora, including religious texts characterized by dense semantics and short documents. This study compares the performance of four topic modelling methods, Latent Dirichlet Allocation (LDA), Biterm Topic Model (BTM), Combined Topic Model (CombinedTM), and BERTopic, in extracting topics from the Indonesian translation of the Qur’an. The dataset consists of 6,236 verses, with each verse treated as a single document. Topic quality is evaluated using two main metrics: coherence score (C_v) and topic diversity. The experimental results show that CombinedTM achieves the highest coherence score, with a maximum of approximately 0.52 at K = 10 topics, followed by BTM, whose coherence scores remain relatively high and stable (around 0.50) across certain numbers of topics. LDA yields the highest topic diversity, exceeding 0.90, but lower coherence scores than the other models, indicating its limitations in preserving semantic coherence in short texts. Meanwhile, BERTopic exhibits consistently high topic diversity (0.85–0.88) across different numbers of topics, although its bag-of-words-based coherence scores do not always increase significantly. These findings highlight that the choice of topic modelling method should be aligned with the characteristics of the corpus and the objectives of thematic analysis, particularly in the context of short-form religious texts.
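
The abstract does not state how topic diversity is computed. A common definition, assumed here, is the fraction of unique words among the top-N words of all topics, so that 1.0 means no topic shares a top word with another. The sketch below illustrates that definition only; the function name `topic_diversity`, the `top_n` parameter, and the example topics are hypothetical and are not taken from the study.

```python
from typing import Sequence


def topic_diversity(topics: Sequence[Sequence[str]], top_n: int = 10) -> float:
    """Fraction of unique words among the top_n words of every topic.

    Assumed definition: (number of distinct top words) / (total top words).
    """
    # Collect the first top_n words of each topic into one flat list.
    top_words = [word for topic in topics for word in topic[:top_n]]
    if not top_words:
        raise ValueError("no topic words given")
    return len(set(top_words)) / len(top_words)


# Hypothetical topics for illustration only; not results from the study.
example_topics = [
    ["allah", "beriman", "orang", "rasul"],
    ["langit", "bumi", "allah", "menciptakan"],
    ["surga", "sungai", "kekal", "orang"],
]

# "allah" and "orang" each appear in two topics: 10 distinct words out of 12.
score = topic_diversity(example_topics, top_n=4)
print(f"topic diversity = {score:.2f}")
```

Under this definition, a model whose topics share many top words scores low, which is why a high diversity score alone (as reported for LDA) does not imply that the individual topics are coherent.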