Journal : Computer Science (CO-SCIENCE)

Penerapan Metode SMOTE Untuk Mengatasi Imbalanced Data Pada Klasifikasi Ujaran Kebencian (Application of the SMOTE Method to Handle Imbalanced Data in Hate Speech Classification)
Ridwan, Ridwan; Heni Hermaliani, Eni; Ernawati, Muji
Computer Science (CO-SCIENCE) Vol. 4 No. 1 (2024): January 2024
Publisher : LPPM Universitas Bina Sarana Informatika

DOI: 10.31294/coscience.v4i1.2990

Abstract

Hate speech is the spread of hatred towards individuals or groups on the basis of ethnicity, religion, race, and other characteristics, and it can lead to discrimination, violence, and social conflict. Imbalanced data can degrade classification performance. This study uses the Synthetic Minority Oversampling Technique (SMOTE) to address the imbalance. Features are extracted with Bag of Words and TF-IDF, then the training data are oversampled using the SMOTE, SVM-SMOTE, Kmeans-SMOTE, and Borderline-SMOTE methods. Classification is performed on Twitter data using the Random Forest, Support Vector Machine, Logistic Regression, and Naive Bayes algorithms. The results show that Borderline-SMOTE handles the imbalanced data better than the other SMOTE variants, with accuracy, recall, precision, and F1-score values of 84.09%, 85.25%, 84.55%, and 81.16%, respectively. The Random Forest algorithm achieves higher performance than the other algorithms.
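
The oversampling step the abstract describes can be illustrated with a minimal sketch of the basic SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is not the authors' implementation and it is not the Borderline-SMOTE variant; the function names `smote` and `oversample`, the toy two-dimensional features, and the default `k=5` are assumptions made for illustration. As in the abstract, only the training data are oversampled.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority rows."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base row (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample(X, y, minority_label, k=5, seed=0):
    """Balance a two-class training set by appending synthetic minority rows."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_rows = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Toy training set (hypothetical feature vectors): 5 majority rows, 2 minority rows.
X_train = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.2],
           [0.9, 1.0], [1.0, 0.8]]
y_train = [0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = oversample(X_train, y_train, minority_label=1)
print(y_bal.count(0), y_bal.count(1))  # prints "5 5"
```

In practice the feature vectors would come from the Bag of Words or TF-IDF step and would be high-dimensional and sparse, so a library implementation is the usual choice; the sketch only shows the interpolation that the SMOTE family shares.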