Rafi, Muhammad Avilla
Unknown Affiliation

Published : 1 Documents Claim Missing Document
Claim Missing Document
Check
Articles

Found 1 Documents
Search

Data Balancing Techniques Using the PCA-KMeans and ADASYN for Possible Stroke Disease Cases Ungkawa, Uung; Rafi, Muhammad Avilla
JOIN (Jurnal Online Informatika) Vol 9 No 1 (2024)
Publisher : Department of Informatics, UIN Sunan Gunung Djati Bandung

Show Abstract | Download Original | Original Source | Check in Google Scholar | DOI: 10.15575/join.v9i1.1293

Abstract

Imbalanced data happens when the distribution of classes is not equal between positive and negative classes. In healthcare, the majority class typically consists of healthy patient data, while the minority class contains sick patient data. This condition can cause the minority class prediction to be wrong because the model tends to predict the majority class. In this study, we use a deep neural network algorithm with focal loss that can deal with class imbalance during training. To balance the data, we use the PCA-KMeans combination model to shrink the dataset and the ADASYN model to give the minority class more samples than it needs. In this study, the research problem is how well the two techniques can improve model performance, especially in minority case classification. The mild model is the best without data balancing, resulting in an accuracy value of 84%. The class 0 F1-score has a value of 86%, whereas the class 1 F1-score has a value of 82%. The moderate model is the best model in the case study of PCA-KMeans balancing data, resulting in an accuracy value of 89%; the class 0 F1-score is 91%; and the class 1 F1-score is 85%. The extreme model is the best model in the ADASYN data balancing case study, resulting in an accuracy value of 95%; the value in class 0 gets a F1-score of 96%, while the value in class 1 gets a F1-score of 96%. Of the three test models, the best model is obtained using ADASYN extreme data balancing with an accuracy value of 95%, the value in class 0 with a F1- score of 93%.