Journal of Technology and Informatics (JoTI)
Vol. 5 No. 2 (2024)

Topic Modeling for Evolving Textual Data Using LDA, HDP, NMF, BERTOPIC, and DTM With a Focus on Research Papers

Pavithra (Unknown)
Savitha (Unknown)



Article Info

Publish Date
29 Apr 2024

Abstract

As the volume of academic literature grows, so does the need for tools that can track evolving research trends. This study applies five topic modeling techniques, namely Latent Dirichlet Allocation (LDA), Hierarchical Dirichlet Process (HDP), Non-negative Matrix Factorization (NMF), BERTopic, and Dynamic Topic Modeling (DTM), to a dynamic corpus of research papers. We address the challenges of capturing temporal dynamics, evolving terminology, and interdisciplinary themes in academic literature. Through a comparative investigation of these models, we assess how effectively each one extracts and tracks research topics over time. DTM exhibited the highest term-topic probability, but its topics included non-meaningful words, which limits its suitability. NMF, HDP, LDA, and BERTopic performed comparably in topic extraction. Even so, DTM emerged, somewhat surprisingly, as the most effective model in our study at tracking evolving research trends.
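To make the idea of extracting topics and following them over time concrete, the following is a minimal sketch, not the method used in the study. It assumes a tiny hypothetical corpus (`DOCS`), runs a plain-Python NMF with multiplicative updates, normalises each topic's term weights into a probability-like distribution, and averages topic shares per publication year. All function names are illustrative, and grouping static NMF output by year is a simple stand-in for DTM, not an implementation of it.

```python
import random

# Hypothetical toy corpus of (year, text) pairs; not data from the study.
DOCS = [
    (2019, "neural network training deep learning"),
    (2019, "deep neural network image recognition"),
    (2020, "topic model latent dirichlet allocation"),
    (2020, "latent topic model text corpus"),
    (2021, "deep learning topic model text"),
]

def build_matrix(docs):
    """Return (vocab, V) where V[d][t] counts term t in document d."""
    vocab = sorted({w for _, text in docs for w in text.split()})
    index = {w: i for i, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for d, (_, text) in enumerate(docs):
        for w in text.split():
            V[d][index[w]] += 1.0
    return vocab, V

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x k) and H (k x terms)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Multiplicative update: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # Multiplicative update: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return W, H

def topic_term_probs(H):
    """Normalise each topic row of H so its term weights sum to 1."""
    return [[h / sum(row) for h in row] for row in H]

def topic_share_by_year(docs, W):
    """Average the normalised topic weights of the documents in each year."""
    totals, counts = {}, {}
    for (year, _), row in zip(docs, W):
        s = sum(row) or 1.0
        acc = totals.setdefault(year, [0.0] * len(row))
        for i, w in enumerate(row):
            acc[i] += w / s
        counts[year] = counts.get(year, 0) + 1
    return {y: [x / counts[y] for x in acc] for y, acc in totals.items()}

vocab, V = build_matrix(DOCS)
W, H = nmf(V, k=2)
probs = topic_term_probs(H)
shares = topic_share_by_year(DOCS, W)

for t, row in enumerate(probs):
    top = sorted(zip(row, vocab), reverse=True)[:3]
    print(f"topic {t}:", [w for _, w in top])
print(shares)
```

In practice one would more likely reach for an established library implementation of each model. The sketch only illustrates the two quantities the abstract refers to: per-topic term weights and the change in topic prevalence across time slices.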

Copyright © 2024






Journal Info

Abbrev

joti

Publisher

Subject

Computer Science & IT, Control & Systems Engineering, Electrical & Electronics Engineering, Engineering, Mechanical Engineering

Description

1. Information Technology: Software Engineering, Data Mining, Mobile Computing, Parallel/Distributed Computing, Artificial Intelligence, Information Systems Governance and Management, User Interface/User Experience, Process Management, IT Security, IS Adoption and Evaluation. 2. Systems ...