Building of Informatics, Technology and Science
Vol 7 No 2 (2025): September 2025

Development of a Convolutional Neural Network Algorithm for Analyzing Speech Emotion Using Mel-Spectrograms

Zakka, Iqlima Sabila (Unknown)
Rakhman, Abdul (Unknown)
Lindawati, Lindawati (Unknown)



Article Info

Publish Date
02 Sep 2025

Abstract

Speech Emotion Recognition (SER) still faces accuracy challenges, especially in distinguishing acoustically similar emotions. Conventional features such as MFCC (Mel Frequency Cepstral Coefficients) often fail to capture the emotional nuances of the voice. To address this, this study develops a Convolutional Neural Network (CNN) model based on the Spec-ResNet architecture that takes Mel-spectrograms as input, with the goal of improving the system's ability to extract and recognize emotional signatures from speech signals. A second objective is to evaluate classification performance on the primary emotions in the RAVDESS dataset and to measure model consistency through 5-fold cross-validation. Spec-ResNet is an adaptation of the ResNet architecture that uses residual learning to strengthen multi-stage feature extraction. Experiments were conducted on the RAVDESS dataset, containing 1,440 voice samples covering six primary emotions: neutral, happy, sad, angry, fearful, and surprised. The test results showed a significant increase in accuracy, with the macro-averaged score reaching 92%, up from 83% for the MLP/SVM baseline. Neutral and happy emotions were classified very well (F1-scores of 93% and 90%, respectively), but fear and surprise remained difficult to distinguish because their vocal patterns are similar. 5-fold cross-validation yielded an average accuracy of 91.5% ± 0.8%. This study demonstrates the strong potential of Mel-spectrograms in SER, while also underscoring the need for more advanced approaches, such as attention mechanisms, to handle ambiguous emotions.
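The abstract reports per-class F1-scores, a macro-averaged score, and a cross-validated accuracy given as mean ± spread. As a minimal sketch of how such figures are commonly computed, the Python below uses only the standard library. The function names, the toy labels, and the toy fold accuracies are illustrative assumptions and are not taken from the paper; the sketch also assumes the "±" term is a sample standard deviation, which the abstract does not state.

```python
from statistics import mean, stdev


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for each label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    return mean(list(per_class_f1(y_true, y_pred).values()))


def summarize_folds(fold_accuracies):
    """Return (mean, sample standard deviation) over the folds."""
    return mean(fold_accuracies), stdev(fold_accuracies)


# Toy data only (hypothetical, NOT results from the paper): fear and
# surprise are confused with each other once in each direction.
y_true = ["neutral", "happy", "fear", "surprise", "fear", "surprise"]
y_pred = ["neutral", "happy", "fear", "fear", "surprise", "surprise"]

f1_by_class = per_class_f1(y_true, y_pred)
macro = macro_f1(y_true, y_pred)

# Hypothetical accuracies from a 5-fold run, for illustration only.
fold_mean, fold_std = summarize_folds([0.80, 0.85, 0.90, 0.85, 0.80])

print(f1_by_class)
print(f"macro F1 = {macro:.2f}")
print(f"accuracy = {fold_mean:.3f} ± {fold_std:.3f}")
```

In the toy labels, the two confused classes pull the macro average down even though neutral and happy are classified perfectly, which mirrors the kind of effect the abstract describes for fear and surprise.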

Copyrights © 2025






Journal Info

Abbrev

bits

Publisher

Subject

Computer Science & IT

Description

Building of Informatics, Technology and Science (BITS) is an open-access journal that publishes scientific articles presenting research results in information technology and computing. Papers submitted to this journal are first checked for plagiarism and peer-reviewed to maintain quality. ...