Multi-label classification is essential for categorizing data under multiple labels simultaneously. However, data imbalance poses a challenge: some labels have far fewer examples than others, which reduces model performance. This study proposes a candidate-based sentiment analysis model for reviews of the 2024 Presidential and Jakarta Gubernatorial Elections. The SMOTE and ADASYN oversampling methods are applied to handle class imbalance, and the two methods are compared using a Random Forest classifier. The experimental results show that in the classification of Presidential candidates, Random Forest achieves an accuracy of 0.947 with SMOTE and 0.948 with ADASYN. For sentiment labels, the accuracy of Random Forest remains high at 0.989 with both SMOTE and ADASYN. In the classification of Jakarta Gubernatorial candidates, Random Forest with SMOTE produces an accuracy of 0.975, while with ADASYN it decreases slightly to 0.973. For sentiment labels, both SMOTE and ADASYN reach an accuracy of 0.993. Applying SMOTE and ADASYN improves the representation of the minority classes without decreasing overall accuracy, and it improves the stability with which the various labels are recognized in a balanced manner.
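To illustrate the oversampling idea mentioned above, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the implementation used in the study: the function name `smote_sketch`, the toy two-dimensional points, and the parameter values are all assumptions made for illustration. Each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration only; not the study's code.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data (assumed): four minority points, to be balanced against ten majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 10
new_points = smote_sketch(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

ADASYN follows the same interpolation step but generates more synthetic points around minority samples that have many majority-class neighbours, so the sketch above covers only the SMOTE variant.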
Copyrights © 2025