Journal of Electronics, Electromedical Engineering, and Medical Informatics
Vol 6 No 4 (2024): October

Application Of SMOTE To Address Class Imbalance In Diabetes Disease Classification Utilizing C5.0, Random Forest, And SVM

M. Khairul Rezki (Unknown)
Mazdadi, Muhammad Itqan (Unknown)
Indriani, Fatma (Unknown)
Muliadi, Muliadi (Unknown)
Saragih, Triando Hamonangan (Unknown)
Athavale, Vijay Annant (Unknown)



Article Info

Publish Date
06 Aug 2024

Abstract

The implementation of SMOTE to tackle class imbalance in classification frequently results in suboptimal outcomes, owing to the intricacy of the dataset and the multitude of attributes at play. Consequently, alternative classification models were explored through experimentation to gauge their precision. This research aims to compare the precision of C5.0, Random Forest, and SVM classification models both with and without SMOTE. The methodology encompasses dataset selection, an overview of classification algorithms (C5.0, Random Forest, SVM), SMOTE technique, validation via split validation, preprocessing involving min-max normalization, and execution evaluation utilizing confusion matrices and AUC analysis. The dataset was sourced by Kaggle, specifically to rectify class imbalance in a diabetes dataset using SMOTE, consisting of 768 instances, with 268 samples for diabetic cases and 500 samples for non-diabetic cases. Prior to SMOTE application, the classification precision for C5.0, Random Forest, and SVM were 0.714, 0.733, and 0.746 respectively, with corresponding AUC values of 0.745, 0.824, and 0.799. Post-SMOTE, the precision depicts for the same techniques were 0.603, 0.727, and 0.727, with AUC values of 0.734, 0.831, and 0.794 respectively. It can be inferred that there's minimal impact post-SMOTE across the three classification models due to potential overfitting on the dataset, leading to excessive reliance on synthesized data for minority classes, resulting in diminished model execution, precision, and AUC scores.

Copyrights © 2024






Journal Info

Abbrev

jeeemi

Publisher

Subject

Computer Science & IT Control & Systems Engineering Electrical & Electronics Engineering Engineering

Description

The Journal of Electronics, Electromedical Engineering, and Medical Informatics (JEEEMI) is a peer-reviewed open-access journal. The journal invites scientists and engineers throughout the world to exchange and disseminate theoretical and practice-oriented topics which covers three (3) majors areas ...