Format : Jurnal Ilmiah Teknik Informatika
Vol 13, No 1 (2024)

Komparasi Algoritma Topic Modelling LDA VS LSA Pada Berita Detikcom

Al Izzi, Ahmad Kemal (Unknown)
Pratama, Rakadian Audiga (Unknown)



Article Info

Publish Date
07 Nov 2024

Abstract

This research applies topic modeling by comparing Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) models on news tweets taken from the Detikcom account. The process begins by crawling data over a one-year period, from December 9, 2022 to December 9, 2023, which yields 958 rows of data. Pre-processing consists of case folding, tokenization, stop-word removal, and stemming. A bag-of-words step then counts how often each word occurs in each document, and these counts serve as the input for building the LSA and LDA models. Each model is built with 8 topics, 10 iterations, and a random state of 42. Topics are then derived from the keywords that appear in the modeling results. The two models are evaluated by measuring topic coherence with the c_v metric. The LSA model achieves a coherence value of 0.5, while the LDA model achieves 0.45. In this case, therefore, the LSA model performs better than the LDA model in terms of topic coherence. For further research, the authors suggest applying topic modeling to other cases and exploring other topic-modeling tools such as OCTIS, which could broaden understanding of how topic-modeling algorithms perform on news data from X.
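The pre-processing and bag-of-words steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the stop-word list, the regular-expression tokenizer, the function names, and the two sample headlines are all assumptions made for illustration, and stemming is left out because the abstract does not say which stemmer was used.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only; the list used in
# the study is not given in the abstract.
STOPWORDS = {"di", "dan", "yang", "ke", "dari", "ini", "itu"}


def preprocess(text):
    """Case-fold, tokenize, and remove stop words (stemming is omitted)."""
    lowered = text.lower()                    # case folding
    tokens = re.findall(r"[a-z]+", lowered)   # tokenization: alphabetic runs only
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(docs):
    """Return (vocabulary, count matrix) with one row per document."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = []
    for doc in tokenized:
        row = [0] * len(vocab)
        for term, count in Counter(doc).items():
            row[index[term]] = count
        matrix.append(row)
    return vocab, matrix


# Made-up example headlines, not taken from the study's data set.
docs = [
    "Harga beras naik di pasar",
    "Harga cabai dan beras turun",
]
vocab, matrix = bag_of_words(docs)
```

Each row of `matrix` holds the word counts for one document, in the column order given by `vocab`. A document-term matrix of this kind is the usual input to both LSA and LDA implementations.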

Copyrights © 2024






Journal Info

Abbrev

format

Publisher

Subject

Computer Science & IT

Description

Format : Jurnal Ilmiah Teknik Informatika is a peer-reviewed journal that publishes the results of research and scholarly studies in Computer Science, particularly Informatics. The scope of publishable manuscripts is focused on (but not limited to) the following fields: ICT, Rekayasa ...