Advancing Indonesian Audio Emotion Classification: A Comparative Study Using IndoWaveSentiment
Majiid, Muhammad Rizki Nur; Setiawan, Karli Eka; Pamungkas, Prayoga Yudha; Annas, Taufiq; Setiawan, Nicholas Lorenzo
Engineering, MAthematics and Computer Science Journal (EMACS) Vol. 7 No. 2 (2025): EMACS
Publisher: Bina Nusantara University

DOI: 10.21512/emacsjournal.v7i2.13415

Abstract

This study addresses the critical gap in Indonesian Speech Emotion Recognition (SER) by evaluating machine learning models on the IndoWaveSentiment dataset, a novel corpus of 300 high-fidelity recordings capturing five emotions (neutral, happy, surprised, disgusted, disappointed) from native speakers. The research aims to identify optimal classification techniques and acoustic features for Indonesian SER, given the language’s unique linguistic characteristics and the scarcity of annotated resources. Six models (Logistic Regression, KNN, Gradient Boosting, Random Forest, Naive Bayes, and SVC) were trained on 45 acoustic features, including spectral contrast, MFCCs, and zero crossing rate, extracted using Librosa. Random Forest was the top performer (90% accuracy), followed by Gradient Boosting (85%) and Logistic Regression (75%), with spectral contrast (contrast2, contrast7) and MFCC1 emerging as the most discriminative features. The findings highlight the efficacy of ensemble methods in capturing nuanced emotional cues in Indonesian speech, outperforming prior studies on locally sourced datasets. Practical implications include applications in customer service analytics and mental health tools, though limitations such as the dataset’s controlled recording conditions and fixed sentence structure necessitate caution in real-world deployment. Future work should expand the dataset to include regional dialects and spontaneous speech, and should explore hybrid architectures such as CNN-LSTMs. This study establishes foundational benchmarks for Indonesian SER, advocating for culturally informed models to enhance human-computer interaction in underrepresented linguistic contexts.
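
Of the acoustic features named in the abstract, zero crossing rate is simple enough to illustrate without any library. The sketch below is not the authors' code (the study extracted its features with Librosa, which reports a frame-wise rate); it is a simplified whole-signal version. The function names, the 16 kHz sample rate, and the synthetic sine inputs are assumptions made only for illustration.

```python
import math


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ.

    Simplified whole-signal version (an illustrative assumption);
    the study itself used Librosa for feature extraction.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(samples, samples[1:])
        if (prev >= 0) != (cur >= 0)  # sign change between neighbours
    )
    return crossings / (len(samples) - 1)


def sine_wave(freq_hz, sample_rate=16000, duration_s=0.1):
    """Hypothetical helper: a pure tone standing in for a recording."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


# A higher-pitched signal crosses zero more often than a lower-pitched one.
low = zero_crossing_rate(sine_wave(200))
high = zero_crossing_rate(sine_wave(2000))
print(f"200 Hz: {low:.3f}, 2000 Hz: {high:.3f}")
```

A scalar like this would be one column of a per-recording feature vector; the 45 features in the study also include spectral contrast bands and MFCCs, which need a spectral transform and are not reproduced here.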