Diabetes, a chronic condition characterized by high blood sugar, requires careful monitoring; left untreated, it can lead to severe complications. This research aims to diagnose diabetes accurately while addressing class imbalance in the dataset, which can degrade a model's classification performance. The goal is to improve diabetes classification accuracy using two balancing methods, the Synthetic Minority Over-sampling Technique (SMOTE) and random oversampling, applied to data from patients diagnosed with diabetes and patients without the disease. The first step addressed class imbalance by applying SMOTE, which generates synthetic minority-class samples, and random oversampling, which replicates existing minority-class samples. The data were then normalized using min-max normalization, and a Random Forest classifier was trained on the result. The results show that this approach improves the model's ability to identify diabetes cases, achieving an accuracy of 96%, a one-percentage-point improvement over the 95% reported in previous research.
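The preprocessing steps described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the function names, the neighbor count `k`, and the binary-label assumption are all hypothetical, and the SMOTE function is a simplified version of the technique. Training the Random Forest classifier is omitted; in practice it would follow on the normalized, balanced data, for example with scikit-learn's `RandomForestClassifier`.

```python
import random

def random_oversample(X, y, minority, rng):
    """Duplicate randomly chosen minority rows until both classes are equal in size."""
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    # Binary labels assumed: every non-minority row belongs to the majority class.
    deficit = (len(X) - len(minority_rows)) - len(minority_rows)
    extra = [list(rng.choice(minority_rows)) for _ in range(deficit)]
    return X + extra, y + [minority] * len(extra)

def smote(X, y, minority, rng, k=5):
    """Simplified SMOTE: interpolate between a minority row and a near minority neighbor."""
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    deficit = (len(X) - len(minority_rows)) - len(minority_rows)
    synthetic = []
    for _ in range(deficit):
        base = rng.choice(minority_rows)
        # Rank the other minority rows by squared Euclidean distance to the base row.
        others = [r for r in minority_rows if r is not base]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return X + synthetic, y + [minority] * len(synthetic)

def min_max_normalize(X):
    """Rescale each feature column to the range [0, 1]."""
    columns = list(zip(*X))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in X
    ]

# Hypothetical toy data: six non-diabetic rows (label 0), two diabetic rows (label 1).
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0],
     [5.0, 50.0], [6.0, 60.0], [8.0, 80.0], [9.0, 90.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
X_ro, y_ro = random_oversample(X, y, minority=1, rng=rng)
X_sm, y_sm = smote(X, y, minority=1, rng=rng)

# Normalization follows balancing, matching the order given in the text.
X_norm = min_max_normalize(X_sm)
```

In this sketch, random oversampling only repeats rows that already exist, while the SMOTE variant produces new rows that lie on the line segment between two existing minority rows. That difference is why the two methods can yield different classifiers from the same data.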
Copyright © 2024