This study predicts the probability of customer default on consumer loan products from historical customer behavior data. The dataset includes income, age, work experience, marital status, home ownership, and car ownership, among other attributes. Dimensionality reduction techniques such as Principal Component Analysis (PCA) are used to reduce the dimensionality of correlated features. Several machine learning algorithms are then applied, including Logistic Regression, K-Nearest Neighbors (KNN), Random Forest, and XGBoost. The results show that XGBoost achieves the highest AUC score, whereas Random Forest achieves the highest accuracy.
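The two metrics can rank models differently because accuracy depends on a fixed decision threshold, while AUC measures only how well defaulters are ranked above non-defaulters. The following is a minimal sketch of that effect in plain Python. The labels, the scores, the 0.5 threshold, and the names `model_a` and `model_b` are hypothetical values chosen for illustration; they are not taken from the study's data or models.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of correct predictions when scores >= threshold are labeled 1."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


def auc(y_true, scores):
    """ROC AUC as the probability that a random positive outranks a random
    negative, counting ties as half a win (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical imbalanced sample: 8 non-defaulters (0) and 2 defaulters (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# model_a ranks both defaulters highest but scores many non-defaulters above 0.5.
model_a = [0.1, 0.2, 0.3, 0.4, 0.55, 0.6, 0.6, 0.65, 0.7, 0.9]

# model_b scores everyone below 0.5, so it always predicts "no default".
model_b = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.2, 0.45]

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "AUC:", auc(y_true, scores), "accuracy:", accuracy(y_true, scores))
# model_a AUC: 1.0 accuracy: 0.6
# model_b AUC: 0.6875 accuracy: 0.8
```

In this toy case the model with the better ranking (higher AUC) has the lower accuracy, because on an imbalanced sample a model that always predicts the majority class already scores well on accuracy.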
Copyrights © 2025