KREATIF: Jurnal Pengabdian Masyarakat Nusantara
Vol. 5 No. 3 (2025): Jurnal Pengabdian Masyarakat Nusantara

Implementation of Data Mining for Stroke Classification Using the K-Nearest Neighbor Algorithm

Enkan Feny Nopitasari (Universitas Muhammadiyah Pontianak)
Syarifah Putri Agustini Alkadri (Universitas Muhammadiyah Pontianak)
Rachmat Wahid Saleh Insani (Universitas Muhammadiyah Pontianak)



Article Info

Publish Date
02 Sep 2025

Abstract

Stroke remains a major global health challenge, with diagnoses often delayed, particularly in primary care facilities with limited infrastructure. This study aimed to develop a stroke risk classification system using the K-Nearest Neighbor (KNN) algorithm, optimized through comprehensive data preprocessing. A secondary dataset of 5,110 patient records was processed using mean imputation for missing BMI values, winsorization to manage outliers, label encoding for categorical variables, and Min-Max normalization for feature scaling. To address class imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied prior to stratified data splitting into 70% training and 30% testing sets. The KNN model with K=5 demonstrated strong performance, achieving 96% accuracy, 96% precision, 99% recall, and a 97% F1-score on the test data. Multivariate correlation analysis identified age, hypertension, and blood glucose levels as the primary predictors of stroke risk, consistent with established clinical pathophysiology. These findings highlight the critical role of cardiometabolic risk factors in early detection. The system was implemented as a web application using Streamlit, enabling rapid and interactive screening in primary healthcare centers with minimal infrastructure. This practical application has the potential to assist healthcare providers in early stroke detection, accelerating clinical intervention and reducing the likelihood of long-term complications. Nevertheless, several limitations exist. The reliance on secondary data introduces the possibility of regional bias, and the use of SMOTE generates synthetic data that may affect model generalizability. Future research is recommended to validate the model across multi-source datasets, apply advanced hyperparameter tuning, and explore ensemble learning techniques to further enhance predictive reliability. 
In conclusion, the KNN-based classification system demonstrates promising potential as a practical decision-support tool for early stroke risk assessment in resource-limited healthcare settings.
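
The preprocessing and classification steps named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the toy records, the 5th/95th percentile winsorization cutoffs, the Euclidean distance metric, and all function names are assumptions made for illustration. Label encoding, SMOTE, and the stratified 70/30 split are omitted to keep the sketch short.

```python
from collections import Counter
from math import dist
from statistics import mean, quantiles


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def winsorize(values, lower_pct=5, upper_pct=95):
    """Clip values to two percentiles (the cutoffs are an assumption)."""
    cuts = quantiles(values, n=100, method="inclusive")  # 99 cut points
    lo, hi = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, lo), hi) for v in values]


def fit_min_max(values):
    """Return the (min, max) pair used for Min-Max scaling."""
    return min(values), max(values)


def scale(value, lo, hi):
    """Map a value onto [0, 1] using a previously fitted (min, max) pair."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def knn_predict(train_rows, train_labels, query, k=5):
    """Majority vote among the k training rows closest to the query."""
    ranked = sorted(zip(train_rows, train_labels), key=lambda p: dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Hypothetical toy records (NOT from the study's dataset): age, glucose, BMI.
ages = [25, 30, 35, 40, 45, 67, 70, 75, 80, 82]
glucose = [85, 90, 95, 88, 100, 180, 200, 210, 190, 230]
bmi = [22.0, None, 24.5, 23.0, 26.0, 31.0, None, 33.5, 29.0, 35.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # 1 = stroke, 0 = no stroke

# Impute missing BMI, then winsorize and scale each feature column.
columns = [winsorize(col) for col in (ages, glucose, impute_mean(bmi))]
ranges = [fit_min_max(col) for col in columns]
scaled = [[scale(v, lo, hi) for v in col] for col, (lo, hi) in zip(columns, ranges)]
train_rows = [tuple(row) for row in zip(*scaled)]


def classify(age, glucose_level, bmi_value, k=5):
    """Scale a new record with the fitted ranges and classify it with KNN."""
    raw = (age, glucose_level, bmi_value)
    query = tuple(scale(v, lo, hi) for v, (lo, hi) in zip(raw, ranges))
    return knn_predict(train_rows, labels, query, k)


print(classify(72, 195, 30.0))  # older record with high glucose
print(classify(28, 87, 23.0))   # younger record with normal glucose
```

In this sketch the scaling ranges are fitted once on the training columns and reused for each new record, so a query is compared with the training rows on the same scale. This matters for KNN because distances are otherwise dominated by features with large numeric ranges, such as glucose level.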

Copyright © 2025






Journal Info

Abbrev

kreatif

Publisher

Subject

Economics, Econometrics & Finance; Education; Engineering; Health Professions; Law, Crime, Criminology & Criminal Justice

Description

Jurnal KREATIF publishes the results of community service activities, as well as models or concepts and/or their implementation, aimed at increasing community participation in development, community empowerment, or the delivery of community service. KREATIF: Jurnal Pengabdian Masyarakat Nusantara, ...