This study evaluates sentiment classification on social media data related to the Palestine–Israel conflict, with particular emphasis on the role of labeling quality and data distribution. The approach combines TF-IDF text representation with lexicon-based labeling using InSet and two classification algorithms: Support Vector Machine (SVM) and Random Forest. The dataset was collected from the social media platform X and consists of 2,831 preprocessed Indonesian-language tweets. The negative class dominated the sentiment distribution (39.35%), followed by the neutral (38.43%) and positive (22.21%) classes, which indicates class imbalance. Validating the automatic labels against manual annotation produced a Cohen’s Kappa of 0.0175, a very low level of agreement that is close to what chance alone would yield. The SVM model achieved an accuracy of 0.69 and a weighted F1-score of 0.68. However, both models performed poorly on the positive class, which is the minority class. These findings suggest that the limits on model performance stem not only from the classification algorithms but also, to a significant degree, from labeling quality and the characteristics of the data distribution. This study contributes by emphasizing the importance of evaluating every stage of the sentiment analysis pipeline, particularly when the data come from complex and uncontrolled sources such as social media.
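To make the labeling-validity check concrete, the following minimal sketch computes Cohen's Kappa between automatic and manual labels from its standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohen_kappa` and the toy label lists are assumptions made for illustration; they are not the study's code or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: share of items on which both labelers agree.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each class, the product of the two labelers'
    # marginal proportions, summed over all classes.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    classes = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in classes) / (n * n)

    if p_e == 1.0:
        # Both labelers used one identical class throughout; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy labels (not taken from the study's dataset).
lexicon_labels = ["neg", "neg", "neu", "pos", "neu", "neg"]
manual_labels = ["neg", "neu", "neu", "neg", "pos", "neg"]
kappa = cohen_kappa(lexicon_labels, manual_labels)
print(f"kappa = {kappa:.4f}")
```

A value near 0, such as the 0.0175 reported above, means the two labelings agree about as often as their class frequencies alone would predict, whereas a value of 1 means perfect agreement.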
Copyrights © 2026