TEKNIK INFORMATIKA
Vol. 18 No. 2: JURNAL TEKNIK INFORMATIKA

Uncovering Hidden Themes in Indie Music: CRISP-DM Guided LDA Topic Modeling on a Kaggle-Based Lyric Generation Dataset

Thoyyibah T
Yan Mitha Djaksana



Article Info

Publish Date
30 Oct 2025

Abstract

The growth of music has produced a large body of data, particularly lyrics, which offer insight into the semantic structure of songs. This study explores latent thematic patterns in an indie lyric dataset from Kaggle by applying Latent Dirichlet Allocation (LDA). It is presented as the first LDA study of indie music lyrics in the Indonesian context, and the resulting topics are interpreted as love, emotional needs, romance, and inner conflict. The study also shows that the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology can be applied effectively to unstructured data, opening up opportunities for better music classification. The methodological stages are business and data understanding, data preparation, modelling, evaluation, and dissemination. In the early stages, Natural Language Processing was applied to the Kaggle dataset through case folding, punctuation removal, stopword removal, stemming, and tokenization. The LDA model was then trained to identify five topics, each with a different interpretation. The results are visualized as WordClouds, together with the distribution of topics across the dataset and a title-based topic mapping. The model yielded a coherence value of 0.3044, which indicates limited semantic consistency: the words within each topic are related to some degree, but there is clear room for refinement in subsequent studies. The limitations of this study are the small dataset, with only 347 rows, and the slight variation in interpretation. Future research should use larger datasets, consider more diverse interpretations, and apply additional machine learning models.
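
The preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stopword list, the suffix-stripping `toy_stem` function, and the sample lyric are all hypothetical stand-ins (a real pipeline would more likely use an established stemmer and stopword list), and tokenization is done before stopword removal here simply because filtering is easier on a token list.

```python
import string

# Hypothetical stopword list for illustration; the study's actual list is not given.
STOPWORDS = {"the", "a", "an", "and", "i", "you", "my", "in", "of", "to", "is"}

# Toy suffix list standing in for a real stemmer; this is an assumption.
SUFFIXES = ("ing", "ed", "s")


def toy_stem(token: str) -> str:
    """Strip one common English suffix if at least 3 letters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(lyric: str) -> list[str]:
    """Apply the five steps named in the abstract to one lyric string."""
    # Case folding
    text = lyric.lower()
    # Punctuation removal (ASCII punctuation only)
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Tokenization on whitespace
    tokens = text.split()
    # Stopword removal
    tokens = [t for t in tokens if t not in STOPWORDS]
    # Stemming
    return [toy_stem(t) for t in tokens]


print(preprocess("I'm still Missing you, and the nights feel Lonely!"))
# → ['im', 'still', 'miss', 'night', 'feel', 'lonely']
```

The resulting token lists are the kind of input a topic model such as LDA would be trained on, typically after being converted to a bag-of-words representation.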

Copyrights © 2025






Journal Info

Abbrev

ti

Publisher

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika is a forum for researchers, lecturers, practitioners, students, and other members of the scientific community to publish articles on research, engineering, and studies in the field of Information Technology. Jurnal Teknik Informatika is published 2 (two) times in ...