Indonesian Journal of Electrical Engineering and Computer Science
Vol 39, No 3: September 2025

UniMSE: a unified approach for multimodal sentiment analysis leveraging the CMU-MOSI dataset

Basu, Miriyala Trinath
Saha, Mainak
Gupta, Arpita
Hazra, Sumit
Fatima, Shahin
Sumalakshmi, Chundakath House
Shanvi, Nallagopu
Reddy, Nyalapatla Anush
Abhinav, Nallamalli Venkat
Hemanth, Koganti



Article Info

Publish Date
01 Sep 2025

Abstract

This paper explores multimodal sentiment analysis using the CMU-MOSI dataset to enhance emotion detection through a unified approach called UniMSE. Traditional sentiment analysis, often reliant on a single modality such as text, faces limitations in capturing complex emotional nuances. UniMSE overcomes these challenges by integrating text, audio, and visual cues, significantly improving sentiment classification accuracy. The study reviews key datasets and compares leading models, showcasing the strengths of multimodal approaches. UniMSE leverages task formalization, pre-trained modality fusion, and multimodal contrastive learning, achieving superior performance on the widely used CMU-MOSI and CMU-MOSEI benchmarks. Additionally, the paper addresses the difficulties of effectively fusing diverse modalities and interpreting non-verbal signals, including sarcasm and tone. Future research directions are proposed to further advance multimodal sentiment analysis, with potential applications in areas such as social media monitoring and mental health assessment. This work highlights UniMSE's contribution to developing more empathetic artificial intelligence (AI) systems capable of understanding complex emotional expressions.
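
As a rough illustration of the modality-fusion and contrastive-learning components named in the abstract, the sketch below shows how projected text, audio, and visual features might be fused for sentiment regression and regularized with a supervised contrastive objective. This is a minimal sketch in PyTorch, not the authors' or UniMSE's actual implementation; all module names, feature dimensions, and the loss formulation are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionEncoder(nn.Module):
    # Projects text, audio, and visual features into a shared space and fuses them.
    def __init__(self, text_dim=768, audio_dim=74, visual_dim=47, hidden_dim=256):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.visual_proj = nn.Linear(visual_dim, hidden_dim)
        self.fusion = nn.Sequential(
            nn.Linear(3 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.regressor = nn.Linear(hidden_dim, 1)  # sentiment score, e.g. in [-3, 3]

    def forward(self, text, audio, visual):
        t = F.relu(self.text_proj(text))
        a = F.relu(self.audio_proj(audio))
        v = F.relu(self.visual_proj(visual))
        fused = self.fusion(torch.cat([t, a, v], dim=-1))
        return fused, self.regressor(fused).squeeze(-1)

def contrastive_loss(embeddings, labels, temperature=0.1):
    # Supervised contrastive term: utterances sharing sentiment polarity attract.
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.t() / temperature
    polarity = (labels > 0).long()
    positives = (polarity.unsqueeze(0) == polarity.unsqueeze(1)).float()
    eye = torch.eye(len(labels), dtype=torch.bool, device=z.device)
    positives = positives.masked_fill(eye, 0.0)          # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim.masked_fill(eye, float("-inf")), dim=1, keepdim=True)
    denom = positives.sum(dim=1).clamp(min=1.0)
    return -((log_prob * positives).sum(dim=1) / denom).mean()

# Example with random features standing in for a batch of 8 utterances.
model = FusionEncoder()
text, audio, visual = torch.randn(8, 768), torch.randn(8, 74), torch.randn(8, 47)
labels = torch.empty(8).uniform_(-3.0, 3.0)  # MOSI-style continuous sentiment labels
fused, pred = model(text, audio, visual)
loss = F.l1_loss(pred, labels) + 0.5 * contrastive_loss(fused, labels)
loss.backward()

The design choice sketched here (late fusion of linear projections plus an auxiliary contrastive term weighted against an L1 regression loss) is only one of several plausible ways to combine the components the abstract mentions; the paper itself should be consulted for the actual architecture.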

Copyright © 2025