Shankar, Thejaswini
Unknown Affiliation

Published: 1 document

Articles

Multimodal facial expression recognition using residual mogrifier long short-term memory
Rajanna, Mamatha Kariyappa; Shankar, Thejaswini; Narasimhamurthy, Rashmi; Annivedu Lakshmanan, Nandhini; S. Ananthapadmanabharao, Hariprasad
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 15, No 2: April 2026
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v15.i2.pp1566-1577

Abstract

Multimodal facial expression recognition integrates visual, audio, and textual cues to improve the accuracy and robustness of emotion analysis. However, effectively recognizing facial expressions across video, text, and audio is challenging because emotions are expressed inconsistently across these modalities. To overcome this issue, this research proposes a residual mogrifier long short-term memory (RMLSTM) model to enhance robustness in multimodal facial expression recognition. By integrating residual connections into the long short-term memory (LSTM), the model improves its ability to capture complex dependencies among the video, text, and audio modalities. The residual connections mitigate the vanishing gradient problem and ensure stable training with better gradient flow in deeper networks. The mogrifier mechanism dynamically refines the input features, enhancing feature interaction and alignment across modalities. The RMLSTM achieves 99.57% and 97.83% accuracy on the SAVEE and YouTube datasets, respectively, outperforming both the mel-frequency cepstral coefficients time-domain feature with iterative dilated convolutional neural network (MFCCT-1DCNN) and the attention-based multi-modal popularity prediction model of short-form videos (AMPS).
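The two mechanisms the abstract describes can be sketched in code. The mogrifier alternately gates the input with the hidden state and vice versa for a few rounds before the LSTM update, and a residual connection adds the cell's input back to its output. This is a minimal illustrative sketch in PyTorch, not the paper's implementation: the class name, number of mogrifier rounds, and the placement of the residual skip are assumptions, and the fusion of the three modalities is omitted.

```python
import torch
import torch.nn as nn


class ResidualMogrifierLSTMCell(nn.Module):
    """Illustrative LSTM cell with mogrifier input/state mixing and a
    residual skip over the recurrent update (details are assumptions)."""

    def __init__(self, dim: int, rounds: int = 4):
        super().__init__()
        self.cell = nn.LSTMCell(dim, dim)
        self.rounds = rounds
        # q[k]: maps h to a gate on x (odd rounds); r[k]: maps x to a gate on h (even rounds)
        self.q = nn.ModuleList(nn.Linear(dim, dim) for _ in range(rounds // 2 + rounds % 2))
        self.r = nn.ModuleList(nn.Linear(dim, dim) for _ in range(rounds // 2))

    def mogrify(self, x: torch.Tensor, h: torch.Tensor):
        # Alternately modulate the input and the hidden state:
        #   x <- 2*sigmoid(Q h) * x   (odd rounds)
        #   h <- 2*sigmoid(R x) * h   (even rounds)
        for i in range(1, self.rounds + 1):
            if i % 2 == 1:
                x = 2 * torch.sigmoid(self.q[i // 2](h)) * x
            else:
                h = 2 * torch.sigmoid(self.r[i // 2 - 1](x)) * h
        return x, h

    def forward(self, x: torch.Tensor, state):
        h, c = state
        x_m, h_m = self.mogrify(x, h)
        h_new, c_new = self.cell(x_m, (h_m, c))
        # Residual connection: add the unmodified input to the cell output,
        # giving the gradient a direct path through deep/stacked layers.
        return h_new + x, (h_new, c_new)


# Usage on a batch of 2 feature vectors of width 8 (dimensions are arbitrary)
cell = ResidualMogrifierLSTMCell(dim=8)
x = torch.randn(2, 8)
h0 = torch.zeros(2, 8)
c0 = torch.zeros(2, 8)
out, (h1, c1) = cell(x, (h0, c0))
```

Because the residual path requires the input and output to share a width, the sketch uses equal input and hidden sizes; a real multimodal pipeline would first project each modality's features to this common dimension.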