Anemia is a public health problem that demands serious attention, given the relatively high proportion of mild, moderate, and severe cases across regions. Reducing the number of cases requires a method that can accurately predict anemia risk. This study aims to identify the most influential features for predicting anemia risk and to assess the performance of the LightGBM method on this task. The research comprised several stages: preprocessing, feature selection using mutual information, class balancing with SMOTE, hyperparameter optimization via grid search, and evaluation of LightGBM on Complete Blood Count (CBC) data from hematology laboratory tests. The results indicate that the six most influential of the 16 features in the original dataset are Hb, RBC, LYMP, HCT, MCV, and MCH. LightGBM achieved an accuracy exceeding 97% and an AUC of 0.99, indicating that the method predicts anemia risk with high accuracy on this dataset.
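The feature-selection and balancing stages named above can be illustrated with a minimal sketch. It is not the study's implementation: it uses only the Python standard library, the function names `mutual_information` and `smote` are hypothetical, the equal-width binning is an assumed way of estimating mutual information, and the toy rows are made-up values rather than the study's CBC data. The grid search and LightGBM stages are omitted here.

```python
import math
import random
from collections import Counter


def mutual_information(feature, labels, bins=4):
    """Estimate I(X; Y) in nats after equal-width binning of a numeric feature."""
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature
    binned = [min(int((x - lo) / width), bins - 1) for x in feature]
    n = len(labels)
    joint = Counter(zip(binned, labels))
    px, py = Counter(binned), Counter(labels)
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (px[b] * py[y]))
        for (b, y), c in joint.items()
    )


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows, each interpolated between a minority row
    and one of its k nearest minority neighbours (needs >= 2 minority rows)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy rows with made-up values (NOT the study's data): [Hb, MCV, noise]
names = ["Hb", "MCV", "noise"]
rows = [
    [15.1, 90.0, 0.3], [14.2, 88.0, 0.9], [13.8, 91.0, 0.1], [14.9, 87.0, 0.7],
    [13.5, 89.0, 0.5], [15.4, 92.0, 0.2], [10.2, 74.0, 0.8], [9.6, 71.0, 0.4],
]
labels = [0, 0, 0, 0, 0, 0, 1, 1]  # 1 = anemic (hypothetical labelling)

# Rank features by their estimated mutual information with the label.
scores = {
    name: mutual_information([r[i] for r in rows], labels)
    for i, name in enumerate(names)
}
ranking = sorted(scores, key=scores.get, reverse=True)

# Oversample the minority class until both classes have the same size.
minority = [r for r, y in zip(rows, labels) if y == 1]
n_needed = labels.count(0) - labels.count(1)
synthetic = smote(minority, n_needed, k=1)
balanced_rows = rows + synthetic
balanced_labels = labels + [1] * n_needed

print(ranking)
print(Counter(balanced_labels))
```

In practice, maintained implementations of both techniques (for example in scikit-learn and imbalanced-learn) would normally be preferred over hand-written versions. To avoid leaking synthetic samples into the evaluation, oversampling is usually applied only to the training split.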
Copyrights © 2026