Journal : Computing and Information System Journal

Effect Of Random Under Sampling and Random Over Sampling Method On Svm Performance Agil Dwi Saputra; Deni Arifianto; Reni Umilasari
Computing and Information System Journal Vol. 1 No. 2 (2025): Integration of Automation and Information Systems in Enhancing Organizational C
Publisher : IndoCompt Publisher

Abstract

Imbalanced data is a common challenge in sentiment analysis, because it can bias the classification model towards the majority class and cause it to ignore important information from the minority class. This study evaluates the effect of two resampling methods, Random Under Sampling (RUS) and Random Over Sampling (ROS), on the performance of the Support Vector Machine (SVM) algorithm on imbalanced sentiment data. Data were collected from the social media platform X (Twitter) on the topic of the naturalization of soccer players in Indonesia. The research process includes preprocessing, TF-IDF weighting, and model testing using K-Fold Cross Validation with K = 2, 5, and 10. Evaluation was based on four metrics: F1-score, recall, precision, and accuracy. The results show that the ROS method provides the best performance, especially at K = 10, with an F1-score of 0.80, recall of 0.78, precision of 0.84, and accuracy of 0.85, while RUS shows a smaller performance improvement. These results show that selecting an appropriate resampling method can improve the performance of a classification model on imbalanced data.
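As an illustration of the two resampling methods named in this abstract, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names, the seed parameter, and the toy data are assumptions made for the example. ROS duplicates randomly chosen minority items until every class matches the largest class; RUS randomly discards items until every class matches the smallest class.

```python
import random
from collections import Counter, defaultdict


def _group_by_class(samples, labels):
    """Collect samples into a dict keyed by class label."""
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    return by_class


def random_over_sample(samples, labels, seed=0):
    """ROS sketch: pad each class with random duplicates up to the largest class size."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = max(len(items) for items in by_class.values())
    pairs = []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        pairs.extend((item, label) for item in items + extra)
    rng.shuffle(pairs)
    return [p[0] for p in pairs], [p[1] for p in pairs]


def random_under_sample(samples, labels, seed=0):
    """RUS sketch: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = min(len(items) for items in by_class.values())
    pairs = []
    for label, items in by_class.items():
        pairs.extend((item, label) for item in rng.sample(items, target))
    rng.shuffle(pairs)
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Hypothetical toy data: 8 majority-class and 2 minority-class items.
toy_texts = [f"tweet {i}" for i in range(10)]
toy_labels = ["negative"] * 8 + ["positive"] * 2

_, ros_labels = random_over_sample(toy_texts, toy_labels)
_, rus_labels = random_under_sample(toy_texts, toy_labels)
print(Counter(ros_labels))
print(Counter(rus_labels))
```

In a cross-validation setting, resampling of this kind is normally applied to the training portion of each fold only, so that duplicated items do not leak into the test portion. Whether the study did so is not stated in the abstract.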
THE EFFECT OF IMPERFECTIVE DATA SAMPLING METHOD ON SUPPORT VECTOR MACHINE ACCURACY Erna Cholida, Ferdy Maulana; Arifianto, Deni; Umilasari, Reni
Computing and Information System Journal Vol. 1 No. 3 (2025): Data Science, UI/UX, and E-Government for Decision Making
Publisher : IndoCompt Publisher

Abstract

Sentiment analysis is used to understand the direction of public opinion, but problems arise when the distribution of sentiment data is unbalanced and one class dominates. This imbalance biases classification models such as Support Vector Machine (SVM) towards the majority class, which reduces the accuracy and generalizability of the model. This study assesses the effectiveness of two data balancing techniques, SVM-SMOTE and ADASYN, in improving SVM performance. The research data were taken from the social media platform X (Twitter), and testing was conducted using K-Fold Cross Validation (K = 2, 5, and 10) with accuracy, precision, recall, and F1-score as evaluation metrics. The results show that without data balancing, the SVM model achieves an average accuracy of only 76.34% and an F1-score of 62.38%, which reflects its weakness in recognizing minority classes. Both balancing methods improved model performance. ADASYN increased the F1-score to 67.94%, while SVM-SMOTE gave the best results, with 82.4% accuracy and a 74.02% F1-score. These findings indicate that, of the two, SVM-SMOTE is the more effective technique for handling data imbalance and improving sentiment classification accuracy.
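Unlike random over sampling, SMOTE-family methods such as SVM-SMOTE and ADASYN create new minority-class points rather than copying existing ones. The sketch below shows only the interpolation step these methods share, placing a synthetic point on the line segment between a minority sample and one of its minority-class neighbours. It is not a full SVM-SMOTE or ADASYN implementation: how each method chooses which samples to interpolate from, and how many points to generate, is omitted. The function name and toy vectors are assumptions made for the example.

```python
import random


def synthesize(sample, neighbour, rng):
    """Return a point at a random position on the segment from sample to neighbour."""
    gap = rng.random()  # uniform in [0, 1)
    return [a + gap * (b - a) for a, b in zip(sample, neighbour)]


# Hypothetical two-dimensional feature vectors for two minority-class items.
rng = random.Random(0)
point = synthesize([0.0, 1.0], [1.0, 3.0], rng)
print(point)
```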