Highlights:
Advanced resampling techniques improved class balance in stroke datasets
Gradient Boosting with SMOTE reached 92% accuracy with SHAP interpretability

ABSTRACT

Introduction: Stroke represents a significant global health concern, impacting millions worldwide and contributing substantially to morbidity and mortality. Early detection and accurate risk prediction remain critical for effective prevention strategies.

Objective: This study aimed to improve stroke risk prediction by employing machine learning algorithms on health survey data to identify key predictors and enhance predictive performance.

Method: A dataset derived from the National Health and Nutrition Examination Survey, comprising 4,603 participants, was utilized. The dataset exhibited class imbalance, with only 7.86% of individuals diagnosed with stroke. To address this imbalance, advanced resampling techniques, including SMOTE, SMOTETomek, and ADASYN, were applied. A range of tree-based algorithms was implemented, including Gradient Boosting, AdaBoost, XGBoost, and a Voting Classifier integrating Decision Tree, AdaBoost, and Gradient Boosting classifiers. Model evaluation included accuracy and AUC scores. Explainable Artificial Intelligence (XAI) analyses were conducted using SHAP (SHapley Additive exPlanations) to interpret feature importance.

Result: The Gradient Boosting classifier, in conjunction with SMOTE, achieved the highest performance with an accuracy of 92% and an AUC score of 0.70. SHAP analysis identified age, general health condition, marital status, and BMI as the most influential predictors of stroke risk.

Conclusion: This study underscores the essential need for ongoing advancements in early stroke detection methodologies. The findings highlight the transformative potential of machine learning and XAI in predictive healthcare, offering valuable insights for stroke prevention strategies.
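
To illustrate the resampling step named in the Method, the following is a minimal sketch of the core idea behind SMOTE: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is not the study's implementation, which is not described at this level of detail and would more plausibly rely on an established library. The function names `smote` and `balance`, the parameter defaults, and the toy two-feature rows (loosely standing in for age and BMI) are assumptions made up for illustration, not values from the dataset.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows by SMOTE-style interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base row.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_rows = smote(minority, n_new, k=k, seed=seed)
    return list(X) + new_rows, list(y) + [minority_label] * len(new_rows)


# Toy, made-up rows: 8 majority (label 0) and 3 minority (label 1).
X = [
    [34.0, 22.5], [41.0, 25.0], [29.0, 21.0], [50.0, 27.5],
    [38.0, 24.0], [45.0, 26.0], [55.0, 28.0], [31.0, 23.0],
    [68.0, 31.0], [72.0, 29.5], [75.0, 33.0],
]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

X_bal, y_bal = balance(X, y)
print(sum(y_bal), len(y_bal) - sum(y_bal))  # prints: 8 8
```

In a real pipeline, oversampling of this kind is normally applied to the training split only, so that synthetic rows never leak into the data used to report accuracy or AUC.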