Pratama, Imam Bagus



Optimizing Stroke Prediction Using Backward Elimination and SMOTE with C4.5 and K-Nearest Neighbors
Pratama, Imam Bagus; Fanani, Ahmad Zainul; Soeleman, M. Arief; Kumalasari, Via Indriani
Journal of Information System and Informatics Vol 8 No 2 (2026): April
Publisher : Asosiasi Doktor Sistem Informasi Indonesia

DOI: 10.63158/journalisi.v8i2.1521

Abstract

Early prediction of stroke risk is crucial for reducing mortality and the burden on the healthcare system, but class imbalance and irrelevant features often compromise model reliability. This study analyzes the impact of Backward Elimination and SMOTE on the performance of the C4.5 and K-Nearest Neighbors (K-NN) algorithms in stroke prediction. It used a fixed working subset of 1,239 data points and evaluated four modeling scenarios using stratified 10-fold cross-validation. Model performance was measured using accuracy, precision, recall, F1-score, and AUC. The results showed that Backward Elimination improved model performance on the analyzed subset. For C4.5, accuracy increased from 70.94% to 73.05%, stroke recall from 83.94% to 85.14%, and AUC from 0.776 to 0.806. For K-NN, accuracy increased from 72.31% to 74.82% and precision from 39.91% to 42.73%, while stroke recall remained relatively stable at 74.30%. Although these improvements are numerically small, they remain practically relevant because they improve the balance between sensitivity and class discrimination. In stroke screening, reducing false negatives takes priority because it minimizes undetected high-risk cases, although false positives still matter because they lead to further testing. Overall, C4.5 with Backward Elimination shows the more balanced performance, though the results are limited to the analyzed subset.
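The abstract names SMOTE but does not describe how it was configured, so the following is only a minimal sketch of the general idea: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote`, its parameters, and the toy feature values are illustrative assumptions, not the study's implementation or data.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Illustrative sketch only; parameter names and defaults are assumptions.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two numeric features (hypothetical values,
# not taken from the study's dataset).
stroke_cases = [[67.0, 228.7], [80.0, 105.9], [74.0, 70.1], [69.0, 94.4]]
new_points = smote(stroke_cases, n_new=6, k=2, seed=42)
```

Because every synthetic point is a convex combination of two real minority samples, it always falls inside the range spanned by the existing minority class. When oversampling is combined with cross-validation, a common precaution is to apply it only to the training portion of each fold so that no synthetic point derived from a test sample leaks into training; the abstract does not say whether the study did this.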