This research evaluates and compares two widely used machine learning algorithms, Logistic Regression and Decision Tree, in the context of credit scoring, focusing on model optimization through regularization, ensemble methods, and data balancing. The study emphasizes two challenges common in credit-scoring data, class imbalance and multicollinearity, both of which can degrade predictive accuracy. Logistic Regression optimized with LASSO regularization and Decision Tree optimized with AdaBoost were evaluated on accuracy, precision, recall, F1-score, and ROC-AUC. The results indicate that Logistic Regression performed better in accuracy, precision, and ROC-AUC, whereas Decision Tree with AdaBoost achieved higher recall, making it more effective at detecting high-risk borrowers. Applying SMOTE (Synthetic Minority Over-sampling Technique) improved both models' ability to predict the minority class, though it slightly reduced the precision of Logistic Regression. The findings suggest that Logistic Regression suits institutions that prioritize model interpretability and stability, while Decision Tree with AdaBoost is better suited to detecting at-risk borrowers in imbalanced datasets. This research contributes to the field of credit scoring by offering insights into the application of machine learning algorithms and optimization techniques in financial institutions.
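As a rough illustration of the pipeline summarized above, the following minimal sketch trains a LASSO-regularized Logistic Regression and an AdaBoost ensemble of decision trees on SMOTE-balanced training data and reports the five metrics listed. It assumes scikit-learn and imbalanced-learn and substitutes a synthetic imbalanced dataset for the study's credit data; all hyperparameter values are illustrative assumptions, not the configuration used in the paper.

```python
# Illustrative sketch only: synthetic data and assumed hyperparameters,
# not the study's actual dataset or settings.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from imblearn.over_sampling import SMOTE

# Synthetic imbalanced data: roughly 10% minority ("default") class.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.9, 0.1], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

# SMOTE is applied to the training split only; the test set is left untouched.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

models = {
    # L1 (LASSO) penalty shrinks collinear coefficients toward zero.
    "LASSO Logistic Regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", solver="liblinear", C=1.0)),
    # AdaBoost boosts shallow decision trees (depth-1 stumps by default).
    "AdaBoost Decision Tree": AdaBoostClassifier(n_estimators=200,
                                                 random_state=42),
}

for name, model in models.items():
    model.fit(X_res, y_res)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    print(f"{name}: acc={accuracy_score(y_test, pred):.3f} "
          f"prec={precision_score(y_test, pred):.3f} "
          f"rec={recall_score(y_test, pred):.3f} "
          f"f1={f1_score(y_test, pred):.3f} "
          f"auc={roc_auc_score(y_test, proba):.3f}")
```

Resampling only the training split reflects standard practice: evaluating on an untouched, still-imbalanced test set keeps the reported precision, recall, and ROC-AUC representative of real deployment conditions.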