International Journal of Computer Science and Humanitarian AI
Vol. 2 No. 2 (2025): IJCSHAI

Implementation of Random Forest Algorithm in Handling Imbalanced Data: A Study on Default Models and Hyperparameter Tuning

Ivan William Lianata (Bina Nusantara University)
Kang Nicholas Darren Nugroho (Bina Nusantara University)
Yosua Nathanael (Bina Nusantara University)
Neilson Christopher (Bina Nusantara University)
Edy Irwansyah (Bina Nusantara University)



Article Info

Publish Date
31 Oct 2025

Abstract

Artificial intelligence, and machine learning (ML) in particular, has advanced rapidly and brought substantial benefits to healthcare. Imbalanced data remains a significant problem in medical classification because it can impair model performance, particularly in identifying important minority classes such as patients with a particular disease. This study evaluates how well two ensemble-based algorithms, Random Forest and Gradient Boosting, handle class imbalance in diabetes prediction. The dataset, obtained from Kaggle, includes age, body mass index, smoking history, HbA1c level, blood glucose level, and other demographic and medical variables. The methodology comprises data preprocessing, train-test splitting, model implementation with default parameters, and hyperparameter tuning using Grid Search with cross-validation. Model performance was assessed using accuracy, precision, recall, F1-score, and AUC-ROC. Both models achieved accuracy above 97%. After tuning, Random Forest reached 97.06% accuracy, an AUC of 0.974, and a positive-class precision of 0.99; however, its recall declined slightly, which could lead to underdiagnosis. Gradient Boosting, by contrast, showed consistent performance, with an AUC of 0.9791 and an F1-score of 0.81. These results indicate that hyperparameter tuning can improve model performance, but that algorithm selection should be driven by the needs of the application, especially in medical settings where balancing sensitivity against diagnostic precision is crucial.
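The trade-off the abstract describes, high accuracy and positive-class precision alongside weaker recall, can be illustrated with a small sketch. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the study's results, and the names `ConfusionCounts` and `classification_metrics` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'positive' meaning diabetic."""
    tp: int  # diabetic patients correctly flagged
    fp: int  # healthy patients wrongly flagged
    fn: int  # diabetic patients missed
    tn: int  # healthy patients correctly cleared


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute accuracy, precision, recall, and F1 for the positive class."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total
    # Guard against division by zero when no positives are predicted or present.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical test set: 1000 patients, of whom 90 are diabetic.
# A classifier that is cautious about flagging patients:
cautious = classification_metrics(ConfusionCounts(tp=60, fp=2, fn=30, tn=908))

# A trivial baseline that labels every patient as non-diabetic:
all_negative = classification_metrics(ConfusionCounts(tp=0, fp=0, fn=90, tn=910))

for name, m in [("cautious", cautious), ("all_negative", all_negative)]:
    print(name, {k: round(v, 3) for k, v in m.items()})
```

In this made-up example the trivial baseline is still right about most patients while detecting no diabetic cases at all, which is why accuracy alone is a weak criterion on imbalanced data and why recall and F1-score are reported alongside it.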

Copyright © 2025






Journal Info

Abbrev

ijcshai

Publisher

Subject

Humanities; Computer Science & IT; Control & Systems Engineering; Electrical & Electronics Engineering; Engineering

Description

International Journal of Computer Science and Humanitarian AI (IJCSHAI) is an international journal published biannually in February and October. The Journal focuses on various issues: Computer Science, Artificial Intelligence (AI), Fuzzy Systems, Expert Systems, Geo-AI, Machine Learning, Deep ...