Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics
Vol. 8 No. 1 (2026): February

The Effect of SMOTE-Tomek on the Classification of Chronic Diseases Based on Health and Lifestyle Data

Muhammad Adika Riswanda
Friska Abadi
Muhammad Itqan Mazdadi
Mohammad Reza Faisal
Rudy Herteno



Article Info

Publish Date
02 Mar 2026

Abstract

Machine learning models for chronic disease prediction are often trained on imbalanced healthcare datasets in which non-disease cases dominate. Such imbalance can produce misleadingly high accuracy while the model fails to identify patients with chronic diseases, which limits clinical usefulness. This study analyzes the impact of class imbalance on model performance and evaluates the effectiveness of the SMOTE–Tomek resampling technique in improving chronic disease prediction. It provides empirical evidence that accuracy alone is insufficient for evaluating healthcare models and demonstrates that imbalance-aware preprocessing is essential for valid and reliable chronic disease detection. Five classification models (Support Vector Machine, Random Forest, K-Nearest Neighbors, Gradient Boosting, and XGBoost) were evaluated on a lifestyle-based chronic disease dataset under two conditions: without resampling and with SMOTE–Tomek. Model performance was assessed using accuracy, precision, recall, F1-score, and AUC. Without SMOTE–Tomek, all models failed to detect chronic disease cases, producing near-zero recall and F1-scores despite accuracy exceeding 80%. After SMOTE–Tomek was applied, all models improved substantially, particularly in recall and AUC. Support Vector Machine achieved the best overall performance, with an accuracy of 92.9%, a precision of 92%, a recall of 93.9%, an F1-score of 0.93, and an AUC of 0.98. The findings confirm that handling class imbalance is a prerequisite for meaningful chronic disease prediction. The consistent increase in recall and AUC across all evaluated models indicates that the improvement stems from enhanced class separability rather than metric inflation. The proposed approach supports more reliable early screening and decision-support systems in preventive healthcare.
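
The abstract's central point, that accuracy can exceed 80% while recall and F1-score stay near zero, can be illustrated with a small sketch. The cohort below (850 non-disease and 150 disease cases) and the helper name `binary_metrics` are hypothetical and are not taken from the study's dataset; the sketch only shows how a model that always predicts "no disease" scores on an imbalanced set of that shape.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Assumed convention: report 0.0 when a ratio is undefined (zero denominator).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced cohort: 850 non-disease cases, 150 disease cases.
# A classifier that always predicts "non-disease" yields no true positives:
# all 150 disease cases become false negatives, all 850 others true negatives.
majority_only = binary_metrics(tp=0, fp=0, fn=150, tn=850)

for name, value in majority_only.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical case accuracy simply equals the share of the majority class, so it says nothing about whether any disease case is detected; recall and F1-score expose the failure directly.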

Copyrights © 2026






Journal Info

Abbrev

ijeeemi

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management; Electrical & Electronics Engineering; Health Professions; Materials Science & Nanotechnology

Description

Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics (IJEEEMI) publishes peer-reviewed, original research and review articles in an open-access format. Accepted articles span the full extent of the electronics, biomedical, and medical informatics fields. IJEEEMI seeks to ...