Indonesian Journal of Electrical Engineering and Computer Science
Vol 40, No 3: December 2025

Sentiment analysis in Arabic and dialects: a review utilizing a corpus-based approach

Hussein Ali, Abbas
Barişçi, Necaattin



Article Info

Publish Date
01 Dec 2025

Abstract

Arabic is one of the most morphologically complex languages, and its numerous dialects make identifying sentiment in digital communication a challenging task. In this study, we conduct a systematic literature review (SLR) of the sentiment analysis (SA) techniques applied to Modern Standard Arabic (MSA) and several Arabic dialects (AD) between 2020 and 2024. A corpus-based analysis of 71 articles indicates that machine learning (ML) and deep learning (DL) algorithms were the dominant methods. The most frequently studied dialects are those of Saudi Arabia, Morocco, and, to a lesser extent, Algeria. Among the algorithms used for text classification, support vector machines (SVM) and convolutional neural networks (CNN) emerged as some of the most effective for sentiment classification. While word embeddings such as Word2Vec are gaining traction in the field, traditional feature extraction methods such as term frequency-inverse document frequency (TF-IDF) continue to outperform them. The study highlights the importance of additional labeled datasets and tailored models for handling the linguistic richness of AD. The results also point to the need for dialect-specific adaptations to improve SA outcomes, and further work that leverages advanced DL methodologies and improved data resources is needed to address these issues.
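
As a rough illustration of the TF-IDF weighting mentioned above, the following minimal Python sketch computes term weights for a tiny corpus. It is not taken from the reviewed studies: the function name tf_idf, the transliterated toy tokens, and the specific variant (length-normalized term frequency times an unsmoothed natural-log inverse document frequency) are all assumptions made for illustration. Library implementations often use smoothed or otherwise adjusted formulas.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length,
    idf = ln(N / document frequency), with no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for an empty document
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus of transliterated tokens, invented for illustration.
docs = [
    ["film", "jamil", "jiddan"],
    ["film", "sayyi"],
    ["khidma", "sayyi", "jiddan"],
]
weights = tf_idf(docs)
```

In this sketch, a term that occurs in only one document (such as "jamil") receives a higher weight than a term shared across documents (such as "film"), and a term present in every document receives a weight of zero. The resulting weight vectors are the kind of features that a classifier such as an SVM could be trained on.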

Copyright © 2025