PIKSEL : Penelitian Ilmu Komputer Sistem Embedded and Logic
Vol. 14 No. 1 (2026): March 2026

A Predictive Model for Type 2 Diabetes Using A Wrapper-Based Feature Selection Method

Khairunisa Hilyati (Unknown)
Nuciko Abdul Halim (Universitas Budi Luhur)
Wendi Usino (Budi Luhur University)



Article Info

Publish Date
31 Mar 2026

Abstract

The global prevalence of diabetes mellitus continues to rise, making early detection of diabetes risk essential for preventing serious complications. This research evaluates the effectiveness of a wrapper-based feature selection technique in improving the performance of classification models for early-stage diabetes risk prediction. The feature selection method is Recursive Feature Elimination (RFE), combined with three classification algorithms: Random Forest, Support Vector Machine (SVM), and Logistic Regression. The dataset was obtained from RSUD Pemangkat, Sambas Regency, West Kalimantan. RFE is expected to identify and eliminate less relevant features, thereby simplifying the model, enhancing interpretability, and improving efficiency without compromising accuracy. This is particularly important in medical data analysis, where datasets are often complex and contain numerous clinical variables. Model performance is evaluated using accuracy, F1-score, and Area Under the Curve (AUC) to give a comprehensive assessment of classification capability. A comparative analysis determines which combination of feature selection method and classification algorithm performs best. In the baseline scenario, with all features used, Random Forest outperformed the other algorithms, with an accuracy of 0.9909, an F1-score of 0.9927, an AUC of 0.9995, and a sensitivity (recall) of 1.0000, meaning that every diabetes case in the test data was detected with no false negatives. SVM and Logistic Regression achieved accuracies of 0.9545 and 0.9273, respectively. Although SVM classifies well, it tends to produce more false positives, while Logistic Regression excels in model interpretability.
With an optimized model, the system has the potential to help healthcare professionals carry out screening and clinical decision-making more quickly and effectively.
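The abstract names Recursive Feature Elimination but gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes a small hand-written logistic regression as the wrapped estimator, ranks features by the absolute value of their weights, and drops one feature per round. The feature names (`strong_signal`, `weak_signal`, `noise`) and the toy data are hypothetical and are not taken from the RSUD Pemangkat dataset.

```python
import math

def standardize(X):
    """Scale each column to zero mean and unit variance so weights are comparable."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) for j in range(d)]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in X]

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent and return the feature weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - target  # predicted probability minus label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w

def rfe(X, y, names, n_keep):
    """Refit the model, drop the feature with the smallest |weight|, repeat until n_keep remain."""
    X = standardize(X)
    active = list(range(len(names)))  # column indices still in the model
    eliminated = []
    while len(active) > n_keep:
        subset = [[row[j] for j in active] for row in X]
        weights = fit_logistic(subset, y)
        weakest = min(range(len(active)), key=lambda k: abs(weights[k]))
        eliminated.append(names[active.pop(weakest)])
    return [names[j] for j in active], eliminated

# Hypothetical toy data (assumption, for illustration only): two columns that
# shift with the class label and one column that is unrelated to it.
names = ["strong_signal", "weak_signal", "noise"]
X, y = [], []
for k in range(30):
    for label in (0, 1):
        X.append([10 * label + 5 * (k % 3), 2 * label + (k % 5), k % 2])
        y.append(label)

selected, eliminated = rfe(X, y, names, n_keep=2)
print("selected:", selected)
print("eliminated:", eliminated)
```

The loop refits the estimator after every removal, which is what makes the method a wrapper rather than a one-off filter. In practice a library implementation such as scikit-learn's `RFE` would likely be wrapped around each of the three classifiers; whether the study did so is not stated in the abstract.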

Copyrights © 2026






Journal Info

Abbrev

piksel

Publisher

Subject

Computer Science & IT
Decision Sciences, Operations Research & Management

Description

Jurnal PIKSEL is published by Universitas Islam 45 Bekasi as a venue for research results in the fields of computing and informatics. The journal was first published in 2013 and appears twice a year, in January and September. Starting in 2014, Jurnal PIKSEL underwent ...