Fajar Apriyadi
Unknown Affiliation

Published: 1 document
Articles

Enhancing Stroke Prediction with Logistic Regression and Support Vector Machine Using Oversampling Techniques
Risal, Syamsul; Fajar Apriyadi; A. Sumardin; Andini Dani Achmad; Annisa Nurul Puteri
Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) Vol 9 No 3 (2025): June 2025
Publisher: Ikatan Ahli Informatika Indonesia (IAII)

DOI: 10.29207/resti.v9i3.6431

Abstract

Stroke is a significant health concern that can result in both death and disability, making the early identification of risk factors crucial. Previous studies on stroke prediction have been limited by inadequate handling of class imbalance and by a lack of comprehensive feature selection and parameter optimization, with accuracy rates usually below 80%. This study compares the performance of Logistic Regression (LR) and Support Vector Machine (SVM) algorithms combined with different resampling methods (the oversampling techniques SMOTE, Borderline-SMOTE, ADASYN, and Random Over Sampling (ROS), together with Random Under Sampling (RUS)) on a stroke prediction dataset. Correlation-based feature selection identified age, hypertension, and heart disease as significant predictors. GridSearchCV with 10-fold cross-validation was used for hyperparameter optimization, and performance was evaluated using precision, recall, accuracy, and ROC curves. The results showed that SVM significantly outperformed LR across all sampling methods. SVM + ROS achieved the highest performance, with perfect recall (100%), precision of 97.18%, and accuracy of 98.56% (AUC: 0.9857), whereas SVM + Borderline-SMOTE offered balanced performance, with recall of 94.99%, precision of 95.06%, and accuracy of 95.17% (AUC: 0.9512). Among the LR models, LR + Borderline-SMOTE performed best, with an accuracy of 84.98% (AUC: 0.8503), above the sub-80% accuracy typical of previous studies. This improved accuracy could bring significant clinical benefits, potentially reducing missed stroke diagnoses by identifying thousands of additional at-risk patients in large-scale screening programs. Healthcare providers should consider implementing SVM with ROS in critical care settings, where missed stroke cases have severe consequences. SVM with Borderline-SMOTE, meanwhile, may be more appropriate for resource-constrained environments.
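
To make two ideas from the abstract concrete, the following is a minimal sketch of Random Over Sampling (ROS) and of how precision, recall, and accuracy are derived from prediction counts. It is not the authors' implementation: the function names `random_oversample` and `precision_recall_accuracy` are hypothetical, the toy rows are invented for illustration (they only borrow the three predictors named above), and only the Python standard library is used.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))  # sample with replacement
            y_out.append(label)
    return X_out, y_out


def precision_recall_accuracy(y_true, y_pred, positive=1):
    """Compute the three metrics from true/false positive/negative counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy


# Invented toy rows: (age, hypertension, heart_disease) -> stroke label.
X = [(67, 0, 1), (45, 0, 0), (33, 0, 0), (52, 1, 0), (29, 0, 0), (80, 1, 1)]
y = [1, 0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(X, y)
class_counts = Counter(y_bal)  # both classes now have 4 rows

# Metrics on a made-up set of predictions.
metrics = precision_recall_accuracy([1, 0, 1, 0], [1, 1, 0, 0])
```

In this sketch the minority class grows only by duplication, which is what distinguishes ROS from SMOTE-style methods that synthesize new points. In practice, resampling is usually applied to the training folds only, so that duplicated rows do not leak into the evaluation data.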