The growing number of people with diabetes mellitus makes early identification a crucial step in reducing the risk of complications. Applying data mining to health data offers a more systematic way to analyze and classify the disease. This research applies the C4.5 decision tree algorithm to classify diabetes using the Pima Indians Diabetes dataset. The dataset comprises records of 768 female patients, each described by eight medical attributes, including glucose level, body mass index, blood pressure, age, and number of pregnancies. Following the CRISP-DM workflow, the research process covered data processing, model development, and evaluation of the classification results. The classification model built with the C4.5 algorithm achieved an accuracy of 77.04%. The structure of the resulting decision tree showed that the glucose level attribute played a dominant role in determining the classification results. These findings indicate that the C4.5 decision tree algorithm is a fairly effective approach to support the initial classification of diabetes based on medical data.
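As a brief illustration of how C4.5 evaluates a split on a continuous attribute such as glucose level, the following minimal sketch computes the gain ratio (information gain divided by split information) for a binary threshold split. It is a simplified sketch and not the implementation used in this study: the function names `entropy`, `gain_ratio`, and `best_threshold` are hypothetical, candidate thresholds are taken as midpoints between distinct values for simplicity, and the glucose values and labels are invented toy data, not records from the Pima Indians Diabetes dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((c / total) * math.log2(c / total) for c in counts)


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` vs. `value > threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    info_gain = entropy(labels) - weighted
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Return the midpoint threshold with the highest gain ratio (simplified)."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


# Invented toy data for illustration only (1 = diabetic, 0 = non-diabetic).
toy_glucose = [85, 90, 100, 110, 140, 150, 160, 170]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    t = best_threshold(toy_glucose, toy_labels)
    print(t, gain_ratio(toy_glucose, toy_labels, t))
```

In a full C4.5 implementation this comparison is repeated across all attributes at each node, and the attribute with the best score becomes the test at that node, which is how a strongly discriminating attribute can come to dominate the tree.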
Copyrights © 2026