COVID-19 is a deadly virus that shocked the world, and one of its consequences is respiratory infection. The solution put forward in this study is to predict COVID-19 infection by classifying chest X-ray data. A challenging issue in this field is the imbalance between the number of infected and uninfected chest X-rays. With imbalanced data, a classifier tends to ignore the class with fewer samples. Data sampling techniques address this problem by balancing the class distribution before training. This study evaluates several of them: Random Undersampling (RUS), Random Oversampling (ROS), Combination of Over-Undersampling (COUS), Synthetic Minority Over-sampling Technique (SMOTE), and Tomek Link (T-Link). Classification is performed with Support Vector Machines (SVM), chosen for their high accuracy. The sampling techniques are compared by accuracy and Area Under Curve (AUC), and the technique with the highest values is selected. The best result was obtained with SMOTE, with an accuracy of 99% and an AUC of 99.32%. Among the techniques evaluated, SMOTE is therefore the best data sampling technique for the classification of COVID-19 chest X-ray data.
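To make the resampling idea concrete, the following is a minimal sketch of three of the techniques named above (ROS, RUS, and a SMOTE-style interpolation), written in plain Python on a toy two-dimensional dataset. It is not the study's implementation: the function names, the toy points, the neighbour count `k`, and the fixed seeds are all assumptions chosen for illustration, and COUS, T-Link, and the SVM classifier are not covered.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """ROS: duplicate random samples of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def smote_like(X, y, minority, k=2, seed=0):
    """SMOTE-style: create synthetic minority points by interpolating between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    counts = Counter(y)
    pool = [x for x, lab in zip(X, y) if lab == minority]
    needed = max(counts.values()) - counts[minority]
    X_out, y_out = list(X), list(y)
    for _ in range(needed):
        base = rng.choice(pool)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in pool if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
        y_out.append(minority)
    return X_out, y_out


# Toy imbalanced data (hypothetical): 8 majority points (label 0), 3 minority points (label 1).
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.4, 0.2), (0.3, 0.5), (0.5, 0.1),
     (0.2, 0.4), (0.6, 0.3), (2.0, 2.1), (2.2, 1.9), (1.8, 2.0)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_ros, y_ros = random_oversample(X, y)
X_rus, y_rus = random_undersample(X, y)
X_sm, y_sm = smote_like(X, y, minority=1)

print("original:", Counter(y))
print("ROS:     ", Counter(y_ros))
print("RUS:     ", Counter(y_rus))
print("SMOTE:   ", Counter(y_sm))
```

The design difference is visible in the outputs: ROS only repeats existing minority points, RUS discards majority points, while the SMOTE-style routine adds new points that lie on segments between existing minority samples. In practice, resampling would be applied to the training split only, so that the accuracy and AUC are measured on untouched test data.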
Copyright © 2021