Jurnal Buana Informatika
Vol. 16 No. 01 (2025): Jurnal Buana Informatika, Volume 16, Nomor 01, April 2025

Importance of Feature Selection for Multiple Disease Classification

Andika, Rio Arya
Dewi, Christine



Article Info

Publish Date
01 Apr 2025

Abstract

The performance of machine learning models in disease classification depends heavily on effective feature selection. This study evaluates two feature selection methods, Boruta and Recursive Feature Elimination (RFE), in combination with tree-based classifiers (Random Forest, Decision Tree, Gradient Boosting, LightGBM, and XGBoost) on Electronic Health Records (EHR) data. Combining Boruta with LightGBM yields the highest accuracy, 99%. Feature selection improves precision by retaining relevant variables and discarding irrelevant ones. Further analysis shows that features such as Red Blood Cells, Insulin, Heart Rate, and Cholesterol strongly influence the classification of specific diseases. These findings underline the importance of feature selection in multi-disease classification and medical data analysis, and show that it can make machine learning systems more efficient. Future research should develop more flexible feature selection methods and test the models on more diverse disease datasets.
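
The kind of pipeline the abstract describes can be illustrated with a minimal sketch of Recursive Feature Elimination using scikit-learn. Everything specific here is an assumption made for illustration and not taken from the study: the data is synthetic (generated with `make_classification`) and is not the EHR dataset, Random Forest serves as both the ranking estimator and the final classifier, and the number of retained features (5) and the elimination step (3) are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a multi-disease dataset (assumption: 3 classes,
# 20 features of which only 5 are informative). Not the study's EHR data.
X, y = make_classification(
    n_samples=300,
    n_features=20,
    n_informative=5,
    n_redundant=2,
    n_classes=3,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RFE repeatedly fits the estimator and drops the least important features
# (3 per round here) until 5 remain. It is fitted on the training split only,
# so the test split does not influence which features are kept.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=5,
    step=3,
)
selector.fit(X_train, y_train)

X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)


def fit_and_score(train_features, test_features):
    """Train a fresh Random Forest and return its accuracy on the test split."""
    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(train_features, y_train)
    return accuracy_score(y_test, model.predict(test_features))


acc_all = fit_and_score(X_train, X_test)
acc_selected = fit_and_score(X_train_sel, X_test_sel)

print("Selected feature indices:", list(selector.get_support(indices=True)))
print(f"Accuracy with all 20 features: {acc_all:.3f}")
print(f"Accuracy with 5 selected features: {acc_selected:.3f}")
```

Comparing the two accuracy figures shows whether the reduced feature set holds up against the full one on this toy data; the numbers say nothing about the results reported in the study. Boruta could be substituted for RFE in the same position of the pipeline, but it requires a separate package and is not shown here.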

Copyright © 2025