India has the second-highest number of diabetes cases in the world after China, with more than 77 million cases in 2021. This study analyzes the risk of developing diabetes using the C4.5 algorithm applied to public health data in India. The dataset was obtained from Kaggle under the title “Pima Indians Diabetes Dataset”, updated in September 2025. The data were processed in RapidMiner, where a C4.5 decision tree classified records using variables such as age, number of pregnancies, body mass index (BMI), blood pressure, blood sugar level, and insulin level. The results show that the C4.5 algorithm can identify diabetes risk patterns with fairly good accuracy, so it can serve as a decision-support tool for early detection of diabetes risk.
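The splitting criterion behind C4.5, the gain ratio, can be illustrated with a minimal Python sketch. This is not the RapidMiner workflow used in the study. The function names (`entropy`, `gain_ratio`, `best_threshold`) and the six sample values are hypothetical, and the threshold search is simplified: it scores midpoints between distinct values directly by gain ratio, whereas full C4.5 implementations differ in how they pick thresholds and also handle pruning and missing values.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on a numeric attribute."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    weights = [len(left) / n, len(right) / n]
    # Information gain: entropy before the split minus weighted entropy after it.
    gain = entropy(labels) - (weights[0] * entropy(left) + weights[1] * entropy(right))
    # Split information penalizes very unbalanced partitions.
    split_info = -sum(w * math.log2(w) for w in weights)
    return gain / split_info


def best_threshold(values, labels):
    """Simplified search: try midpoints between consecutive distinct values."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


if __name__ == "__main__":
    # Hypothetical blood sugar readings and outcomes (1 = diabetic), for illustration only.
    glucose = [85, 90, 100, 140, 150, 160]
    outcome = [0, 0, 0, 1, 1, 1]
    t = best_threshold(glucose, outcome)
    print(f"best threshold: {t}, gain ratio: {gain_ratio(glucose, outcome, t):.3f}")
```

A decision tree learner repeats this scoring for every attribute (age, BMI, blood pressure, and so on), splits on the attribute with the highest gain ratio, and recurses on each resulting subset.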
Copyrights © 2026