Journal : Jurnal Teknik Informatika (JUTIF)

Hybrid Model for Speech Emotion Recognition using Mel-Frequency Cepstral Coefficients and Machine Learning Algorithms
Nurdiawan, Odi; Ade Kurnia, Dian; Sudrajat, Dadang; Pratama, Irfan
Jurnal Teknik Informatika (Jutif) Vol. 6 No. 5 (2025): JUTIF Volume 6, Number 5, Oktober 2025
Publisher : Informatika, Universitas Jenderal Soedirman

DOI: 10.52436/1.jutif.2025.6.5.5143

Abstract

Speech Emotion Recognition (SER) is a subfield of affective computing that focuses on identifying human emotions from voice signals. Accurate emotion classification is essential for developing intelligent systems capable of interacting naturally with users. However, challenges such as background noise, overlapping emotional features, and speaker variability often reduce model performance. This study aims to develop a lightweight hybrid SER model by combining Mel-Frequency Cepstral Coefficients (MFCC) as feature representations with three machine learning algorithms: Support Vector Machine (SVM), Decision Tree (DT), and K-Nearest Neighbors (KNN). The methodology involves audio data preprocessing, MFCC-based feature extraction, and classification using the selected algorithms. The RAVDESS dataset, consisting of 1,440 English-language audio samples across four emotions (happy, angry, sad, neutral), was used with an 80/20 train-test split to ensure class balance. Experimental results show that the KNN model achieved the highest performance, with an accuracy of 78.26%, precision of 85.09%, recall of 78.26%, and F1-score of 77.06%. The Decision Tree model produced comparable results, while the SVM model performed poorly across all metrics. These findings indicate that the proposed hybrid approach is effective for recognizing emotions in speech and offers a computationally efficient alternative to deep learning models. The integration of MFCC features with multiple machine learning classifiers provides a robust framework for real-time emotion recognition applications, especially in environments with limited computing resources.
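
The split, classification, and scoring steps described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. Several things here are assumptions: random Gaussian vectors stand in for real MFCC features, k = 5 and Euclidean distance are illustrative choices, and all helper names (`make_toy_features`, `stratified_split`, `knn_predict`, `weighted_scores`) are hypothetical. The abstract does not say how precision, recall, and F1 were averaged; support-weighted averaging is assumed here. Under that assumption recall always equals accuracy, which is consistent with the identical 78.26% figures reported.

```python
import math
import random
from collections import Counter, defaultdict


def make_toy_features(n_per_class=40, dim=13, seed=0):
    """Toy stand-in for MFCC vectors (assumption): one Gaussian cluster per emotion."""
    rng = random.Random(seed)
    emotions = ["happy", "angry", "sad", "neutral"]
    X, y = [], []
    for c, emotion in enumerate(emotions):
        for _ in range(n_per_class):
            X.append([rng.gauss(c, 1.0) for _ in range(dim)])
            y.append(emotion)
    return X, y


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling the test share within each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idx = by_class[label][:]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    nearest = sorted((math.dist(x, t), label) for t, label in zip(train_X, train_y))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def weighted_scores(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)
    predicted = Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    precision = recall = f1 = 0.0
    for c, n_c in support.items():
        p_c = hits[c] / predicted[c] if predicted[c] else 0.0
        r_c = hits[c] / n_c
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = n_c / n  # each class counts in proportion to its test support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(hits.values()) / n
    return accuracy, precision, recall, f1


X, y = make_toy_features()
train_idx, test_idx = stratified_split(y)  # 80/20 split within each class
train_X = [X[i] for i in train_idx]
train_y = [y[i] for i in train_idx]
y_true = [y[i] for i in test_idx]
y_pred = [knn_predict(train_X, train_y, X[i]) for i in test_idx]
accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

With real data, the toy vectors would be replaced by per-clip MFCC features, and the same split and scoring could be reused to compare SVM and Decision Tree classifiers against KNN.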