Diabetes prediction plays a vital role in healthcare, enabling early diagnosis and timely intervention to mitigate the risks associated with the disease. This study investigates the application of advanced machine learning architectures to diabetes prediction using the Pima Indians Diabetes Dataset, a widely used benchmark for medical diagnostics. Five models were developed and evaluated: a Deep Neural Network (DNN), a Convolutional Neural Network (CNN) with Attention, an LSTM with Residual Connections, a Bidirectional LSTM (BiLSTM) with Attention, and a GRU with Dense Layers. Each model was assessed on accuracy, precision, recall, F1 score, and ROC AUC. A stratified five-fold cross-validation strategy was employed to ensure robustness, and SHAP analysis was conducted to enhance interpretability. Among the models, the GRU with Dense Layers performed best, recording the highest accuracy (76.17%), F1 score (69.85%), and ROC AUC (83.52%). SHAP analysis identified Glucose as the most influential feature and revealed significant interactions between Glucose and Pregnancies, consistent with established medical insights. Statistical analysis showed that all metrics were significantly better than a random-chance baseline (p < 0.05). These findings underscore the efficacy of GRU-based models in capturing complex patterns in medical data while maintaining computational efficiency. Future work will explore hybrid architectures and larger datasets to improve generalizability and real-world applicability, contributing to more effective decision-making in healthcare.
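The evaluation protocol described above (stratified five-fold cross-validation scored by accuracy, precision, recall, F1 score, and ROC AUC) can be illustrated with a minimal sketch. This is not the study's implementation: the function names (`stratified_folds`, `binary_metrics`, `cross_validate`, `fit_midpoint_rule`) are hypothetical, the single-feature midpoint rule is a stand-in for the neural models, and the data are synthetic values invented for illustration rather than the Pima Indians Diabetes Dataset.

```python
import math
import random
from statistics import mean

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with roughly equal class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin
    return folds

def binary_metrics(y_true, scores, threshold=0.5):
    """Accuracy, precision, recall, F1 and ROC AUC for labels in {0, 1}."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # ROC AUC: probability that a random positive outscores a random
    # negative, counting ties as one half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1, "roc_auc": auc}

def cross_validate(X, y, fit, k=5, seed=0):
    """Train on k-1 folds, score the held-out fold, average each metric."""
    per_fold = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores = [predict(X[i]) for i in held_out]
        per_fold.append(binary_metrics([y[i] for i in held_out], scores))
    return {name: mean(m[name] for m in per_fold) for name in per_fold[0]}

def fit_midpoint_rule(X_train, y_train):
    """Placeholder model (an assumption, not a GRU): a sigmoid centred on
    the midpoint between the two class means of the first feature."""
    m0 = mean(x[0] for x, t in zip(X_train, y_train) if t == 0)
    m1 = mean(x[0] for x, t in zip(X_train, y_train) if t == 1)
    mid, scale = (m0 + m1) / 2, (m1 - m0) or 1.0
    return lambda x: 1.0 / (1.0 + math.exp(-(x[0] - mid) / scale))

if __name__ == "__main__":
    # Synthetic, made-up data: one glucose-like feature per sample.
    rng = random.Random(42)
    y = [1] * 35 + [0] * 65
    X = [[rng.gauss(140 if t else 110, 25)] for t in y]
    for name, value in cross_validate(X, y, fit_midpoint_rule).items():
        print(f"{name}: {value:.3f}")
```

Any model can be evaluated by passing a different `fit` callable that returns a scoring function. Averaging the per-fold metrics is one common way to report cross-validated results; the study may aggregate them differently.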
Copyright © 2024