The growth of music has produced a large body of data, particularly song lyrics, that offers insight into the semantic structure of music. This study explores latent thematic patterns in an indie lyric dataset from Kaggle by applying Latent Dirichlet Allocation (LDA). It is the first LDA study of indie music lyrics in the Indonesian context, and it interprets the resulting topics in terms of love, emotional needs, romance, and inner conflict. The study follows the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology, which can be applied effectively to unstructured data and opens up opportunities for better music classification. The methodological stages are business and data understanding, data preparation, modelling, evaluation, and dissemination. During data preparation, the Kaggle dataset was preprocessed with Natural Language Processing techniques: case folding, punctuation removal, stopword removal, stemming, and tokenization. The LDA model was trained to identify five topics, each with a distinct interpretation. The results are visualized as word clouds, as the topic distribution across the dataset, and as a title-based topic mapping. The model yielded a coherence value of 0.3044, which indicates limited semantic consistency: the words within each topic are related to some degree, but there is clear room for refinement in subsequent studies. The limitations of this study are the small dataset, with only 347 rows, and the narrow variation in interpretations. Future research should use larger datasets, draw on more diverse interpretations, and apply additional machine learning models.
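The preprocessing steps named above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the function name `preprocess`, the tiny English `STOPWORDS` set, and the pluggable `stem` callable (identity by default) are all assumptions made for illustration. A real run would substitute a full stopword list and a proper stemmer for the language of the lyrics. Tokenization is done before stopword removal here purely for convenience.

```python
import string

# Hypothetical, deliberately tiny stopword list (assumption for illustration).
STOPWORDS = {"the", "a", "and", "of", "to", "in", "is", "you", "i", "my"}

# Translation table that maps every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(lyric, stopwords=STOPWORDS, stem=lambda word: word):
    """Return a list of cleaned tokens for one lyric string."""
    text = lyric.lower()                       # case folding
    text = text.translate(_PUNCT_TABLE)        # punctuation removal
    tokens = text.split()                      # whitespace tokenization
    tokens = [t for t in tokens if t not in stopwords]  # stopword removal
    return [stem(t) for t in tokens]           # stemming (identity by default)


print(preprocess("I love you, and the Night is LONG!"))
# → ['love', 'night', 'long']
```

The resulting token lists are the kind of input a bag-of-words topic model such as LDA is typically trained on.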