Credit scoring is a key process in the financial sector for accurately assessing the risk posed by prospective debtors. This study builds a credit scoring model by comparing the performance of three machine learning algorithms: XGBoost, random forest, and logistic regression. The dataset is split into training and test data at three ratios: 70:30, 75:25, and 80:20. Data preparation consists of variable selection based on the 5C principles (character, capacity, capital, collateral, and condition), missing-value imputation, categorical transformation of variables, and creation of derived features. Each model is then fitted and optimized to improve classification performance, particularly in recognizing debtors who are likely to default. The evaluation results show that XGBoost performs best, with an accuracy of 84.8%, a precision of 84.9%, a recall of 84.7%, and an AUC of 92.3%. The main assessment variable for each principle is the external credit score for character, income for capacity, car ownership for capital, the credit financing ratio for collateral, and the regional rating for condition.
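For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how accuracy, precision, recall, and AUC can be computed from true labels and predicted default probabilities. It is not the study's code: the function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int],
    y_score: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the study does not state one
) -> Dict[str, float]:
    """Return accuracy, precision, recall, and AUC for a binary classifier.

    y_true holds 1 for a defaulting debtor and 0 otherwise;
    y_score holds the predicted probability of default.
    """
    # Turn scores into hard predictions at the chosen threshold.
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts, with "default" as the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # AUC as the probability that a randomly chosen defaulter receives a
    # higher score than a randomly chosen non-defaulter (ties count as 0.5).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "auc": auc,
    }


# Toy data, invented for illustration only (not from the study's dataset).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
print(classification_metrics(toy_labels, toy_scores))
```

Accuracy, precision, and recall depend on the chosen threshold, whereas AUC is computed from the ranking of scores alone, which is why it can differ noticeably from the other three figures.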
Copyrights © 2025