Cerebrovascular accident (stroke) is a critical health issue in Indonesia, often leading to death or long-term disability. Early detection through machine learning has emerged as a promising approach to improving diagnosis and treatment outcomes. This study compares the performance of two classification algorithms, Decision Tree and Extreme Gradient Boosting (XGBoost), in diagnosing stroke, with SMOTEENN (Synthetic Minority Over-sampling Technique combined with Edited Nearest Neighbours) applied to address class imbalance. The dataset, obtained from a public repository, contains 5,110 samples with 11 independent variables and one dependent variable (stroke status). After preprocessing and data balancing, both models were trained and evaluated on accuracy, precision, recall, and F1-score. XGBoost outperformed Decision Tree on all four metrics, achieving an accuracy of 96.48%, precision of 94.75%, recall of 99.03%, and F1-score of 96.85%, compared with Decision Tree's accuracy of 91.55%, precision of 89.82%, recall of 95.32%, and F1-score of 92.49%. These findings indicate that combining XGBoost with SMOTEENN yields a more effective and reliable classification model for early stroke diagnosis than the Decision Tree alternative. Future research could explore deep learning techniques to further improve diagnostic accuracy.
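As an illustration of how the four evaluation metrics named above relate to one another, the following minimal Python sketch computes them from confusion-matrix counts and recomputes the F1-score from the reported precision and recall. The function names and the confusion-matrix counts are hypothetical and are not taken from the study; only the percentages in `reported` come from the abstract. The recomputed F1 values agree with the reported ones to within about 0.01 percentage points, which is consistent with rounding.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) for the positive (stroke) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall, f1_from(precision, recall)


# Hypothetical counts, for illustration only (NOT the study's confusion matrix).
acc, prec, rec, f1 = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")

# Precision, recall, and F1 (in percent) as reported in the abstract.
reported = {
    "XGBoost": (94.75, 99.03, 96.85),
    "Decision Tree": (89.82, 95.32, 92.49),
}

# Recompute F1 from the reported precision and recall as a consistency check.
for model, (p, r, f1_reported) in reported.items():
    f1_recomputed = f1_from(p, r)
    print(f"{model}: reported F1={f1_reported:.2f}, recomputed F1={f1_recomputed:.2f}")
```

Because recall measures the share of actual stroke cases that the model detects, it is usually the metric of most interest in a screening setting, and F1 balances it against precision.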
Copyright © 2025