Comparison Of The Performance Of K-Nearest Neighbors And Naive Bayes Algorithms For Stroke Disease Prediction
Baskoro, Baskoro; Novianto, Roby; Triraharjo, Bambang
Jurnal CoreIT: Jurnal Hasil Penelitian Ilmu Komputer dan Teknologi Informasi Vol 11, No 2 (2025): December 2025
Publisher : Fakultas Sains dan Teknologi, Universitas Islam Negeri Sultan Syarif Kasim Riau

DOI: 10.24014/coreit.v11i2.37542

Abstract

Purpose: Stroke is a critical global health issue requiring early and accurate prediction to mitigate severe outcomes. This study aims to compare the performance of the K-Nearest Neighbors (KNN) and Naive Bayes algorithms in predicting stroke disease, addressing the challenge of imbalanced datasets and improving prediction accuracy for better clinical decision-making.

Methods/Study design/approach: The research followed the CRISP-DM model, utilizing a dataset of 5,110 patient records with 12 attributes from Kaggle. Data preprocessing included handling missing values and normalization. The KNN and Naive Bayes algorithms were implemented using RapidMiner, with performance evaluated through cross-validation, confusion matrices, and ROC-AUC curves.

Result/Findings: The KNN algorithm achieved an accuracy of 94.50%, but exhibited low precision (7.89%) and recall (1.20%) for stroke-positive cases due to dataset imbalance. Naive Bayes yielded an accuracy of 88.83% with an AUC of 0.767, demonstrating better probability modeling but similar challenges in minority class detection. Both algorithms highlighted the impact of data imbalance on predictive performance.

Novelty/Originality/Value: This study provides a comparative analysis of KNN and Naive Bayes for stroke prediction, emphasizing the need for data balancing and optimization techniques. The findings underscore the potential of these algorithms in healthcare applications while suggesting future improvements through ensemble methods or alternative algorithms like Random Forest.
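The gap between high accuracy and very low precision and recall reported in the findings is typical of imbalanced data, and a small calculation may make it concrete. The sketch below computes the three metrics from confusion-matrix counts. The function name `classification_metrics` and the counts passed to it are hypothetical, chosen only for illustration; they are not taken from the study, which used RapidMiner rather than Python.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from confusion-matrix counts.

    tp/fp/fn/tn are counts for the positive (minority) class.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts (NOT from the paper): 1,000 records, 50 of them positive.
# The classifier labels almost everything negative, so it is right on most
# records while finding only 1 of the 50 positive cases.
metrics = classification_metrics(tp=1, fp=9, fn=49, tn=941)

for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these illustrative counts the script prints an accuracy of 94.20%, a precision of 10.00%, and a recall of 2.00%. Accuracy is dominated by the 941 correctly rejected negatives, which is why it says little about how well the minority class is detected.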