Khoirotul Aini, Yulistia
Unknown Affiliation

Speech Emotion Recognition of Indonesian Movies by Using Convolutional Neural Network
Santoso, Tri Budi; Khoirotul Aini, Yulistia; Dutono, Titon
JOIV : International Journal on Informatics Visualization Vol 9, No 6 (2025)
Publisher : Society of Visual Informatics

DOI: 10.62527/joiv.9.6.3552

Abstract

Speech emotion recognition (SER) is an active research area within human-computer interaction (HCI) systems. The objective of this study is to provide a baseline model for Indonesian-language speech emotion recognition, built from dialogue in Indonesian-language movies. The work began by developing a dataset from film dialogue and grouping it into four emotion classes: angry, happy, neutral, and sad. The resulting dataset contains 5049 data points, consisting of 1202 for angry, 1228 for happy, 2075 for neutral, and 899 for sad. The study extracts Mel-frequency cepstral coefficients (MFCC) as audio features from the movie dialogue and uses a Convolutional Neural Network (CNN) to classify them. The model reached an accuracy of 85.85% during training and 83.35% during testing. After a series of tests with successive refinements to the process, the system's behavior was characterized with a confusion matrix. Angry, happy, and sad expressions are easier to recognize than neutral expressions, which are flat in energy level and other features. In the future, we hope the model can be developed into a cross-corpus model and applied to speakers from various cultures.
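The MFCC feature-extraction step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no extraction settings, so the sample rate, frame length, hop size, FFT size, number of mel filters, and number of coefficients below are all assumed values, and the function names are hypothetical. The input is a synthetic tone standing in for a dialogue clip.

```python
import numpy as np

def hz_to_mel(hz):
    # Common mel-scale approximation.
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters with centers spaced evenly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):        # rising slope
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):       # falling slope
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # All default parameter values are assumptions, not taken from the paper.
    # 1. Split the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    # 3. Mel filterbank energies, then log compression.
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # 4. DCT-II (unnormalized) over the filter axis; keep the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi / n_filters * (n[None, :] + 0.5) * np.arange(n_coeffs)[:, None])
    return log_energies @ basis.T

# Usage: one second of a synthetic 440 Hz tone in place of a real clip.
t = np.arange(16000) / 16000.0
signal = np.sin(2.0 * np.pi * 440.0 * t)
features = mfcc(signal)
print(features.shape)
```

With these assumed settings, a one-second clip yields a 98 x 13 matrix (frames by coefficients). A matrix of this kind is the sort of two-dimensional input a CNN classifier could take for the four emotion classes.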