This study addresses the growing global burden of diabetes by evaluating whether ensemble-based machine learning models can support reliable and cost-efficient early risk prediction. Moving beyond accuracy-centered evaluation, the study integrates cost-sensitive threshold optimization and probability calibration to enhance clinical relevance. Random Forest and XGBoost are evaluated using two datasets with contrasting population characteristics. Model performance is examined in terms of discriminative ability, calibration quality, and total misclassification cost. The results indicate that while XGBoost remains competitive on small-scale datasets, Random Forest provides more stable calibration and more consistent cost efficiency. These findings suggest that cost-sensitive and calibrated ensemble approaches have the potential to support more rational and economically efficient diabetes screening policies.
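As an illustration of the cost-sensitive threshold optimization mentioned above, the following minimal sketch picks the decision threshold that minimizes total misclassification cost. It is not the study's implementation: the function names, the 5:1 cost ratio for missed cases versus false alarms, and the toy labels and probabilities are all assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def total_cost(y_true: Sequence[int], probs: Sequence[float],
               threshold: float, cost_fn: float, cost_fp: float) -> float:
    """Total misclassification cost when predicting positive if prob >= threshold."""
    # False negatives: true cases scored below the threshold (missed diagnoses).
    fn = sum(1 for y, p in zip(y_true, probs) if y == 1 and p < threshold)
    # False positives: non-cases scored at or above the threshold (false alarms).
    fp = sum(1 for y, p in zip(y_true, probs) if y == 0 and p >= threshold)
    return cost_fn * fn + cost_fp * fp


def best_threshold(y_true: Sequence[int], probs: Sequence[float],
                   cost_fn: float = 5.0, cost_fp: float = 1.0,
                   steps: int = 99) -> Tuple[float, float]:
    """Grid-search thresholds in (0, 1); return (threshold, cost) with the lowest cost.

    The default costs are hypothetical; real values would come from clinical
    and economic estimates.
    """
    candidates = [i / (steps + 1) for i in range(1, steps + 1)]
    return min(
        ((t, total_cost(y_true, probs, t, cost_fn, cost_fp)) for t in candidates),
        key=lambda pair: pair[1],
    )


# Toy data (assumed): two non-diabetic and two diabetic cases with predicted risks.
y_true = [0, 0, 1, 1]
probs = [0.10, 0.40, 0.35, 0.80]
threshold, cost = best_threshold(y_true, probs)
```

Because a missed case is weighted more heavily than a false alarm in this sketch, the selected threshold falls below the conventional 0.5. The chosen threshold is only meaningful when the predicted probabilities are well calibrated, which is why calibration and cost are evaluated together.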
Copyright © 2025