UNP Journal of Statistics and Data Science
Vol. 4 No. 2 (2026): UNP Journal of Statistics and Data Science

Hybrid LBFA-Based Feature Selection for Improving Machine Learning Classification Performance in Heart Disease Prediction

Hana Azizah (Unknown)
Eni Sumarminingsih (Universitas Brawijaya)
Adji Achmad Rinaldo Fernandes (Universitas Brawijaya)



Article Info

Publish Date
31 May 2026

Abstract

Feature selection and feature engineering are essential steps in developing accurate machine learning models, particularly when dealing with imbalanced datasets and redundant variables. However, many feature augmentation methods are often applied without a consistent preprocessing strategy, which can reduce model reliability and increase the risk of information leakage. To overcome this issue, this study proposes a hybrid classification framework that combines CatBoost-based feature selection with two feature augmentation techniques: LOGIT transformation and Log Density Ratio (LDR). A structured preprocessing pipeline was designed to ensure consistency throughout the modeling process. One-hot encoding was applied for the LOGIT transformation, while numerical standardization was used for LDR estimation. The generated features were then integrated with the selected original variables to produce richer feature representations for classification. The proposed framework was evaluated using the Heart Disease dataset with three gradient boosting algorithms, namely LightGBM, XGBoost, and CatBoost. Model performance was assessed using accuracy, precision, sensitivity, specificity, and F1-score. The results show that the proposed approach consistently improved classification performance across all models. Among the tested models, LightGBM combined with LOGIT and LDR achieved the best performance, obtaining an accuracy of 0.9618, precision of 0.9485, sensitivity of 0.9620, specificity of 0.9625, and F1-score of 0.9552. These findings suggest that combining feature selection with structured feature augmentation can significantly improve predictive performance in imbalanced classification tasks

Copyrights © 2026






Journal Info

Abbrev

ujsds

Publisher

Subject

Computer Science & IT Decision Sciences, Operations Research & Management Mathematics Social Sciences

Description

UNP Journal of Statistics and Data Science is an open access journal (e-journal) launched in 2022 by Department of Statistics, Faculty of Science and Mathematics, Universitas Negeri Padang. UJSDS publishes scientific articles on various aspects related to Statistics, Data Science, and its ...