Journal of Information Systems and Informatics
Vol 5 No 2 (2023): Journal of Information Systems and Informatics

Classification of Explicit Songs Based on Lyrics Using Random Forest Algorithm

Luh Kade Devi Dwiyani (Udayana University)
I Made Agus Dwi Suarjaya (Udayana University)
Ni Kadek Dwi Rusjayanthi (Udayana University)



Article Info

Publish Date
16 May 2023

Abstract

This study addresses the potential negative impact of explicit songs on children and adolescents. An explicit-song labeling program exists, but it covers only songs released by artists affiliated with the Recording Industry Association of America (RIAA), so songs outside the program's scope remain inadequately labeled. To address this gap, a machine learning model was developed to classify explicit songs from their lyrics. A dataset of song lyrics was collected by web scraping to build the classification model, which was trained using TF-IDF vectorization and the random forest algorithm. Several training-to-testing split ratios were compared to determine the best model. The best model used a 90:10 training-to-testing split and achieved an accuracy of 96.3%, a precision of 99.3%, a recall of 93.5%, and an F1-score of 96.3%. The classification results showed that explicit songs accounted for 39.22% of the dataset, and a visualization showed that the prevalence of explicit songs fluctuated over time. The hip-hop/rap genre had the highest proportion of explicit songs, at 92%.
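The reported metrics are related by standard definitions, and the F1-score can be checked against the reported precision and recall. The following is a minimal sketch of that arithmetic, not the authors' evaluation code: the function names `f1_score` and `metrics_from_counts` are illustrative, and the confusion-matrix counts in the second example are hypothetical, since the abstract does not report them.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives, with "explicit" treated as the positive class.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# Values reported in the abstract for the best model.
reported_precision = 0.993
reported_recall = 0.935

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 from reported precision and recall: {f1:.3f}")

# Hypothetical counts, for illustration only (not from the study).
example = metrics_from_counts(tp=9, fp=1, fn=1, tn=9)
print(example)
```

The harmonic mean of 99.3% and 93.5% rounds to 96.3%, which agrees with the reported F1-score. High precision with lower recall means the model rarely flags a clean song as explicit but misses some explicit ones.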

Copyright © 2023






Journal Info

Abbrev

isi

Publisher

Subject

Computer Science & IT

Description

Journal-ISI is a scientific journal that publishes original ideas and research on the latest technological developments in information systems, information technology, informatics engineering, computer science, and industrial engineering ...