Predicting the health risk levels of hajj pilgrims is a significant challenge in improving healthcare services during the pilgrimage. This research contributes by systematically evaluating several machine learning techniques and applying SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset, whereas previous studies relied on single-model classification approaches. The data comprise 5,000 pilgrim health records covering attributes such as age, gender, medical history, and disease diagnosis, sourced from the Siskohat database of the Directorate General of Hajj and Umrah Management. After SMOTE was applied, Logistic Regression with cross-validation achieved the highest accuracy (87.9%), outperforming Decision Tree (86.4%) and K-NN (83.1%). These findings indicate that SMOTE significantly enhances recall, ensuring better identification of high-risk patients. The results contribute to hajj health management by providing a robust predictive framework that improves early risk detection and medical resource allocation, and they demonstrate a novel approach to handling imbalanced healthcare datasets.
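To illustrate the balancing step mentioned in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority neighbours. It is not the study's implementation. The function name `smote`, the parameters `n_new` and `k`, and the toy records (age, number of diagnosed conditions) are all hypothetical and chosen only for illustration; in practice a maintained library implementation would normally be used instead.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Illustrative sketch only; names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up toy data: [age, number of diagnosed conditions]
low_risk = [[35, 0], [42, 1], [29, 0], [51, 1], [38, 0], [45, 1], [33, 0], [47, 0]]
high_risk = [[68, 3], [72, 4], [65, 2]]

# Oversample the minority class until both classes have the same size
new_points = smote(high_risk, n_new=len(low_risk) - len(high_risk), k=2)
balanced_high_risk = high_risk + new_points
print(len(low_risk), len(balanced_high_risk))
```

Because every synthetic point is a convex combination of two real minority samples, it stays inside the range spanned by the observed minority class. Oversampling of this kind is normally applied to the training folds only, so that evaluation data contain no synthetic records.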
Copyrights © 2025