Y. Chalapathi, Rao
Unknown Affiliation

Published: 1 document


Analysis of human emotions through speech using deep learning fusion technique for Industry 5.0
Anil Kumar, Chevella; Sagar Reddy, Vumanthala; Pravallika, Ambati; Y. Chalapathi, Rao; Syamala, Neelam
Bulletin of Electrical Engineering and Informatics Vol 14, No 1: February 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/eei.v14i1.8464

Abstract

Emotions are important for human well-being and social connections. This work addresses the problem of effectively recognizing emotions in human speech, specifically in the context of Industry 5.0. Traditional approaches and machine learning (ML) techniques for identifying emotions in speech have limitations, such as the need for complicated feature extraction. Traditional methods yield recognition accuracies of no more than 90% because of their restricted extraction of temporal/sequence information. This paper proposes a ground-breaking fusion-based deep learning (DL) method to overcome these limitations. Specifically, one-dimensional (1D) and two-dimensional (2D) convolutional neural networks (CNNs) can automatically extract significant features and handle very large datasets in real time. Furthermore, a fusion-based DL network, the speech emotion recognition deep learning fusion network (SER_DLFNet), is proposed, which combines a CNN with long short-term memory (LSTM) to capture sequence information and increase recognition accuracy. The proposed model shows impressive results, with a test accuracy of 95.52% on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset. This research contributes to the advancement of more precise and efficient emotion identification algorithms for voice analysis, especially within the framework of Industry 5.0.
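
The abstract does not give the layer configuration of SER_DLFNet, so the following is only a minimal sketch of the general CNN-plus-LSTM idea it describes: 1D convolution filters turn a signal into a sequence of feature vectors, an LSTM summarizes that sequence, and a softmax layer scores the emotion classes. Every name, size, and weight here (`conv1d_relu`, `lstm_final_state`, `init_params`, `predict`, four filters, six hidden units, untrained random weights, a synthetic sine-wave input) is an assumption made for illustration and is not taken from the paper. The eight labels are the emotion categories of RAVDESS; whether the paper uses all eight is not stated in the abstract.

```python
import math
import random

# The eight emotion categories labeled in RAVDESS.
EMOTIONS = ["neutral", "calm", "happy", "sad", "angry", "fearful", "disgust", "surprised"]

def conv1d_relu(signal, kernel):
    """Valid-mode 1D convolution (cross-correlation) followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(matrix, bias, vector):
    """Compute matrix @ vector + bias with plain lists."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(matrix, bias)]

def lstm_final_state(sequence, gates, hidden_size):
    """Run a single-layer LSTM over the sequence and return the last hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in sequence:
        z = x + h  # concatenate the current input with the previous hidden state
        i = [sigmoid(v) for v in affine(*gates["i"], z)]    # input gate
        f = [sigmoid(v) for v in affine(*gates["f"], z)]    # forget gate
        o = [sigmoid(v) for v in affine(*gates["o"], z)]    # output gate
        g = [math.tanh(v) for v in affine(*gates["g"], z)]  # candidate cell state
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def init_params(rng, n_filters=4, kernel_size=5, hidden_size=6):
    """Hypothetical, untrained parameters; sizes are arbitrary illustration values."""
    in_size = n_filters + hidden_size
    return {
        "kernels": random_matrix(rng, n_filters, kernel_size),
        "gates": {name: (random_matrix(rng, hidden_size, in_size), [0.0] * hidden_size)
                  for name in ("i", "f", "o", "g")},
        "hidden_size": hidden_size,
        "w_out": random_matrix(rng, len(EMOTIONS), hidden_size),
        "b_out": [0.0] * len(EMOTIONS),
    }

def predict(signal, params):
    """CNN features -> LSTM summary -> softmax over emotion classes."""
    feature_maps = [conv1d_relu(signal, k) for k in params["kernels"]]
    # Regroup the per-filter maps into one feature vector per time step.
    sequence = [list(step) for step in zip(*feature_maps)]
    h = lstm_final_state(sequence, params["gates"], params["hidden_size"])
    logits = affine(params["w_out"], params["b_out"], h)
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
params = init_params(rng)
signal = [math.sin(0.3 * t) for t in range(64)]  # synthetic stand-in for a speech frame
probs = predict(signal, params)
best = EMOTIONS[probs.index(max(probs))]
```

Because the weights are random, `best` carries no meaning here; the sketch only shows how convolutional features and a recurrent layer can be chained. A real system would train these parameters on labeled speech features and would normally be built with a DL framework.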