This systematic literature review examines how multimodal approaches are applied to implicit sentiment detection in the tourism sector, with the aim of supporting data-driven digital development strategies. Ten core studies published between 2020 and 2025 were analyzed to identify prevailing research trends, dominant methodological frameworks, commonly used datasets, and emerging scientific contributions in multimodal sentiment analysis capable of capturing hidden emotions, such as sarcasm and ambiguity, in tourist reviews. Across the reviewed studies, multimodal deep learning models, particularly those employing attention-based fusion and contrastive learning, consistently outperform unimodal approaches in recognizing nuanced tourist emotions that are not explicitly stated in text. Despite these advances, the review reveals a significant gap in tourism-specific and Indonesian-context studies, as well as an overreliance on general-purpose social media datasets. The review provides a conceptual and methodological foundation for implementing multimodal implicit sentiment analysis in tourism decision-making systems, enabling destination managers and policymakers to develop early warning mechanisms for tourist dissatisfaction, enhance destination quality assessment, and support more targeted and sustainable tourism development strategies.
Copyrights © 2026