Optimization of Heart Failure Classification on Imbalanced Data Using a Supervised Learning Approach Based on Logistic Regression, Random Forest, and K-Nearest Neighbor
Agustina, Feri; Irawan, Candra; Erawan, Lalang; Suprayogi; Award Widya Laksana, Deddy; Jatmoko, Cahaya; Sinaga, Daurat; Lestiawan, Heru
Jurnal Informatika Polinema Vol. 12 No. 1 (2025)
Publisher : UPT P2M State Polytechnic of Malang

DOI: 10.33795/jip.v12i1.9071

Abstract

Heart failure remains one of the leading causes of mortality worldwide, posing significant challenges for early diagnosis and patient management. A major obstacle in developing predictive models for heart failure is class imbalance: the number of surviving patients far exceeds the number who experience a death event. This imbalance often biases machine learning algorithms toward the majority class, reducing their sensitivity to critical minority cases. To address this issue, this study applies the Synthetic Minority Oversampling Technique (SMOTE) to balance the dataset and improve model performance. Three supervised learning algorithms, namely Logistic Regression (LR), Random Forest (RF), and K-Nearest Neighbor (KNN), were implemented and compared on the UCI Heart Failure Clinical Records dataset, which contains 299 patient samples with 13 clinical attributes. Experimental results show that the Random Forest model achieved the highest performance, scoring 90% on accuracy, precision, recall, and F1-score, and outperformed both LR and KNN. The findings demonstrate that combining data balancing with ensemble learning effectively enhances prediction accuracy and sensitivity toward the minority class. The main contribution of this research lies in optimizing supervised models for medical data with skewed class distributions, providing a more reliable and interpretable approach for early heart failure detection. Future research may extend this work by integrating advanced ensemble or hybrid deep learning models and by expanding the dataset for multi-institutional validation.
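
The abstract names SMOTE as the balancing step but does not show how it works. The following is a minimal sketch of the core idea only, assuming Euclidean distance and purely numeric features: each synthetic sample is placed at a random position on the line segment between a minority-class point and one of its nearest minority-class neighbors. The function name `smote`, its parameters, and the toy data are hypothetical illustrations and are not taken from the paper or from the UCI records.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbors (the core SMOTE idea)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbors than other points
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical toy data (NOT the UCI records): two numeric features per patient.
majority = [[60.0, 1.0], [55.0, 1.1], [50.0, 0.9],
            [65.0, 1.2], [45.0, 0.8], [58.0, 1.0]]
minority = [[75.0, 2.1], [80.0, 2.5], [70.0, 1.9]]

# Generate just enough synthetic minority points to match the majority count.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "6 6"
```

In practice a maintained implementation, such as the `SMOTE` class in the `imbalanced-learn` package, would normally be used instead of hand-written code. Whether the authors used that package is not stated in the abstract.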