Journal: International Journal of Electrical and Computer Engineering

Exploring topic modelling: a comparative analysis of traditional and transformer-based approaches with emphasis on coherence and diversity
Riaz, Ayesha; Abdulkader, Omar; Ikram, Muhammad Jawad; Jan, Sadaqat
International Journal of Electrical and Computer Engineering (IJECE) Vol 15, No 2: April 2025
Publisher: Institute of Advanced Engineering and Science

DOI: 10.11591/ijece.v15i2.pp1933-1948

Abstract

Topic modeling (TM) is an unsupervised technique for discovering hidden or abstract topics in large corpora by extracting meaningful patterns of words (semantics). This paper explores TM within data mining (DM), focusing on the challenges of, and advances in, extracting insights from datasets, especially from social media platforms (SMPs). Traditional techniques such as latent Dirichlet allocation (LDA) are examined alongside newer transformer-based methods, including bidirectional encoder representations from transformers (BERT), generative pre-trained transformers (GPT), and XLNet. The paper highlights the limitations of LDA, which have prompted the adoption of embedding-based models such as BERT and GPT; rooted in the transformer architecture, these offer enhanced context-awareness and semantic understanding. It emphasizes leveraging pre-trained transformer-based language models to generate document embeddings, refining TM and improving accuracy. Notably, integrating BERT with XLNet-generated summaries emerges as a promising approach. By synthesizing these insights, the paper aims to inform researchers on optimizing TM techniques, potentially shifting how insights are extracted from textual data.