Chennupati Sumanth Kumar
GITAM Deemed to be University

Published: 1 Document
Articles


Acoustic and visual geometry descriptor for multi-modal emotion recognition from videos Kummari Ramyasree; Chennupati Sumanth Kumar
Indonesian Journal of Electrical Engineering and Computer Science Vol 33, No 2: February 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijeecs.v33.i2.pp960-970

Abstract

Recognizing human emotions simultaneously from multiple data modalities (e.g., face and speech) has drawn significant research interest, and numerous contributions have been made in the affective computing community. However, most methods pay little attention to facial alignment and keyframe selection for audio-visual input. Hence, this paper proposes a new audio-visual descriptor that describes the emotion through only a few frames. For this purpose, we propose a new self-similarity distance matrix (SSDM), which computes the spatial and temporal distances through landmark points on the facial image. The audio signal is described through a set of composite features, including statistical features, spectral features, formant frequencies, and energies. A support vector machine (SVM) algorithm is employed to classify both modalities, and the final results are fused to predict the emotion. The Surrey audio-visual expressed emotion (SAVEE) and Ryerson multimedia research lab (RML) datasets are utilized for experimental validation, and the proposed method shows significant improvement over state-of-the-art methods.
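As a rough illustration of the SSDM idea described above, the sketch below builds a frame-by-frame self-similarity distance matrix from facial landmark coordinates and uses it to pick a few maximally dissimilar keyframes. This is not the authors' exact formulation; the array shapes, the Euclidean distance, and the keyframe scoring rule are all assumptions made for the example.

```python
import numpy as np

def self_similarity_distance_matrix(landmarks):
    """Build a T x T self-similarity distance matrix (SSDM).

    landmarks: array of shape (T, K, 2) -- K facial landmark points
    per video frame.  Entry (i, j) is the Euclidean distance between
    the flattened landmark configurations of frames i and j, so the
    matrix encodes both spatial and temporal variation of the face.
    """
    T = landmarks.shape[0]
    flat = landmarks.reshape(T, -1)             # (T, K*2)
    diff = flat[:, None, :] - flat[None, :, :]  # (T, T, K*2) pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1))    # (T, T) symmetric, zero diagonal

def select_keyframes(ssdm, n_frames=3):
    """Score each frame by its mean distance to all other frames and
    keep the highest-scoring ones (a simple stand-in for keyframe
    selection: the least redundant frames are assumed most expressive)."""
    scores = ssdm.mean(axis=1)
    return np.argsort(scores)[::-1][:n_frames]

# Toy example: 10 frames of 5 landmarks drifting over time.
rng = np.random.default_rng(0)
lm = np.cumsum(rng.normal(size=(10, 5, 2)), axis=0)
D = self_similarity_distance_matrix(lm)
keys = select_keyframes(D, n_frames=3)
```

In a full pipeline the descriptor extracted from the selected keyframes would be classified with an SVM and late-fused with the audio branch, as the abstract outlines.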