This study evaluates the effect of the Synthetic Minority Oversampling Technique (SMOTE) on machine learning and deep learning performance in stroke risk classification, using secondary, publicly available data from Kaggle’s Stroke Prediction Dataset (n = 5,110; 249 stroke cases, 4,861 non-stroke cases). Performance was measured using accuracy, precision, recall, and F1-score, and Explainable AI (XAI) methods (SHAP, LIME) were used for interpretability. The results show that applying SMOTE improves model sensitivity to the minority "Stroke" class, with Random Forest after SMOTE achieving 97% accuracy and balanced precision and recall. These findings highlight the methodological potential of combining SMOTE with machine learning, deep learning, and XAI; however, they should not be interpreted as direct clinical validation. Future work with clinical and population-based datasets is necessary to assess applicability in real-world healthcare settings.
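To make the class imbalance and the oversampling idea concrete, the following is a minimal sketch in plain Python, not the study's actual pipeline (which the abstract does not describe). The function names `majority_baseline_accuracy` and `smote_sketch` are hypothetical, the neighbour count `k=5` is an assumed default, and the interpolation assumes purely numeric features and at least two minority samples.

```python
import random

# Class counts reported in the abstract.
N_STROKE = 249
N_NON_STROKE = 4861


def majority_baseline_accuracy(n_minority, n_majority):
    """Accuracy of a classifier that always predicts the majority class."""
    return n_majority / (n_minority + n_majority)


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (illustrative only).

    Each synthetic point lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours,
    which is the core idea behind SMOTE.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


if __name__ == "__main__":
    baseline = majority_baseline_accuracy(N_STROKE, N_NON_STROKE)
    print(f"Majority-class baseline accuracy: {baseline:.4f}")

    # Toy two-feature minority samples (made-up values, not dataset rows).
    toy_minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 1.0), (3.0, 3.0)]
    for point in smote_sketch(toy_minority, n_new=3, k=2):
        print(point)
```

With the reported counts, stroke cases make up about 4.9% of the data, so a model that always predicts "non-stroke" would already reach roughly 95.1% accuracy. This is why recall and F1-score on the minority class are reported alongside accuracy.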
Copyrights © 2025