Abstract: Diabetes is a potentially fatal disease that can lead to heart failure, chronic kidney disease, glaucoma, and several other conditions. According to WHO data, diabetes caused more than 2 million deaths in 2019, and the International Diabetes Federation reports that around 537 million adults are living with diabetes. The condition calls for prompt treatment, since diabetes is among the deadliest non-communicable diseases in the world. Hospitals record most patient data, and much of it becomes digital waste unless it is put to further use. In 2020, the Sylhet Diabetes Hospital donated patient data for further research. The data contains 520 patient records with 17 attributes that have been validated by specialist doctors. It is published by the UCI Machine Learning Repository as the public Early Stage Diabetes Risk Prediction dataset and can be used for research testing. The dataset has been studied widely, with a previous best accuracy of 95.96%. Previous studies used all attributes in the classification process, yet irrelevant attributes can degrade the performance of a classification algorithm. This study applies the information gain ratio for feature selection on the Early Stage Diabetes Risk Prediction dataset. The C4.5 algorithm is used for classification, with evaluation by confusion matrix and validation by 10-fold cross-validation. Feature selection improves the performance of C4.5, which reaches an accuracy of 96.15%. The study also produces a decision tree for diabetes.
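The gain ratio criterion named in the abstract can be illustrated with a minimal sketch. It assumes categorical attributes and uses the standard definition (information gain divided by split information). The function names, the attribute names `symptom_a` and `symptom_b`, and the toy records are hypothetical and are not taken from the study or the dataset.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Information gain of a categorical attribute divided by its split information."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy records invented for illustration only (hypothetical attribute names).
symptom_a = ["Yes", "Yes", "No", "No", "Yes", "No"]
symptom_b = ["Yes", "No", "Yes", "No", "Yes", "No"]
label = ["Positive", "Positive", "Negative", "Negative", "Positive", "Negative"]

scores = {
    "symptom_a": gain_ratio(symptom_a, label),
    "symptom_b": gain_ratio(symptom_b, label),
}
# Rank attributes from most to least informative.
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a feature selection step of this kind, attributes with a low score would be candidates for removal before training the classifier. The threshold used in the study is not stated in the abstract, so none is assumed here.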
Copyright © 2024