Enthusiastic : International Journal of Applied Statistics and Data Science
Volume 4 Issue 2, October 2024

Loan Approval Classification Using Ensemble Learning on Imbalanced Data

Anadra, Rahmi
Sadik, Kusman
Soleh, Agus M
Astari, Reka Agustia



Article Info

Publish Date
01 Oct 2024

Abstract

Loan processing is an important aspect of the financial industry, where sound decisions must be made to approve or reject loan applications. However, default by borrowers has become a significant concern for financial institutions. This research therefore applied ensemble learning with the random forest and Extreme Gradient Boosting (XGBoost) algorithms, and handled imbalanced data using the Synthetic Minority Over-sampling Technique (SMOTE). The aim was to improve accuracy and precision in credit risk assessment and reduce human workload. Both algorithms were trained on a dataset of 4,296 observations with 13 variables relevant to loan approval decisions. The research process involved data exploration, data preprocessing, data splitting, model training, model evaluation using accuracy, sensitivity, specificity, and F1-score, model selection with 10-fold cross-validation, and identification of important variables. The results showed that XGBoost with imbalanced data handling achieved the highest accuracy, 98.52%, with a good balance among sensitivity (98.83%), specificity (98.01%), and F1-score (98.81%). The most important variables in determining loan approval were credit score, loan term, loan amount, and annual income.
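
The four evaluation measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch in plain Python, not the authors' code. It assumes that "approved" is the positive class (coded 1) and "rejected" the negative class (coded 0), which the abstract does not state. The function names and the toy labels are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary problem.

    Assumption: `positive` marks the approved class; the abstract
    does not say which class the authors treated as positive.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluation_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall), specificity, and F1-score."""
    sensitivity = safe_ratio(tp, tp + fn)   # share of true positives recovered
    specificity = safe_ratio(tn, tn + fp)   # share of true negatives recovered
    precision = safe_ratio(tp, tp + fp)
    return {
        "accuracy": safe_ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": safe_ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy labels for illustration only (hypothetical, not the study's data).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = evaluation_metrics(*counts)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters on imbalanced data, because a model that always predicts the majority class can reach a high accuracy while one of the two class-wise rates collapses to zero.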

Copyright © 2024






Journal Info

Abbrev

ENTHUSIASTIC

Publisher

Subject

Computer Science & IT; Decision Sciences, Operations Research & Management; Economics, Econometrics & Finance; Engineering; Mathematics

Description

ENTHUSIASTIC is an international journal published by the Statistics Department, Faculty of Mathematics and Natural Sciences, Universitas Islam Indonesia. ENTHUSIASTIC publishes original research articles or review articles on all aspects of the statistics and data science field which should be ...