Diabetes mellitus is a widespread chronic disease that affects populations worldwide and is often identified only after complications have occurred. Machine learning techniques can be used to estimate the likelihood of diabetes from clinical patient data. This research develops a predictive model using the Extreme Gradient Boosting (XGBoost) algorithm and applies Explainable Artificial Intelligence (XAI) to interpret the model’s outputs. The dataset consists of 768 patient records with eight clinical attributes: number of pregnancies, glucose level, blood pressure, skin thickness, insulin level, body mass index (BMI), diabetes pedigree function, and age. The research process comprises data preprocessing, exploratory data analysis, model development, performance evaluation, and interpretation using SHAP (SHapley Additive exPlanations). The findings indicate that the XGBoost model achieves high predictive performance and identifies key factors associated with diabetes risk. According to the SHAP interpretation, glucose level, BMI, age, and insulin level are the most influential variables in the prediction process. Combining machine learning with explainable AI improves model interpretability while maintaining reliable predictive performance.
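To make the SHAP-based interpretation step more concrete, the following minimal sketch computes exact Shapley values, the quantity on which SHAP attributions are built, for a single hypothetical patient. It is an illustration under stated assumptions, not the study's implementation: `toy_risk` is a made-up scoring function standing in for the trained XGBoost model, the feature values are invented rather than taken from the dataset, and replacing absent features with a single baseline value is a simplification (SHAP implementations typically average over background data instead).

```python
from itertools import combinations
from math import factorial

# Feature names follow the four variables highlighted in the abstract.
FEATURES = ["glucose", "bmi", "age", "insulin"]


def toy_risk(x):
    """Hypothetical stand-in for a trained model (NOT the study's XGBoost model)."""
    score = 0.004 * x["glucose"] + 0.01 * x["bmi"] + 0.003 * x["age"]
    # Interaction term, so attributions are not simply the per-feature terms.
    if x["glucose"] > 140 and x["bmi"] > 30:
        score += 0.2
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(FEATURES)

    def value(coalition):
        x = {f: (instance[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(x)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        # Enumerate every coalition of the other features, grouped by size k.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Made-up illustrative values, not records from the study's dataset.
baseline = {"glucose": 100, "bmi": 25, "age": 30, "insulin": 80}
patient = {"glucose": 160, "bmi": 35, "age": 50, "insulin": 120}

phi = shapley_values(toy_risk, patient, baseline)

# Print contributions from most to least influential for this patient.
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:8s} {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that makes SHAP values readable as per-feature shares of a prediction. Exhaustive enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one; tree-specific algorithms are what make the same attribution feasible for gradient-boosted models.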
Copyright © 2026