Journal : Global Science: Journal of Information Technology and Computer Science

Benchmarking Machine Learning Models for Large-Scale Loan Default Prediction Using Real Data
Authors : Devianto, Yudo; Saragih, Rusmin; Cahyana, Yana
Global Science: Journal of Information Technology and Computer Science, Vol. 2 No. 1 (2026): March
Publisher : International Forum of Researchers and Lecturers

DOI: 10.70062/globalscience.v2i1.181

Abstract

This research benchmarks multiple machine learning (ML) algorithms for large-scale loan default prediction using a real-world dataset of 255,000 borrower records, in which default cases represent only ~9–12% of observations. The study addresses the persistent gap in comparative analyses of ML models that balance predictive accuracy, interpretability, and computational efficiency for credit risk assessment. Six algorithmic families (Logistic Regression, Random Forest, XGBoost, LightGBM, CatBoost, and Artificial Neural Networks (ANN)) and a Stacked Ensemble were evaluated using standardized preprocessing, hybrid imbalance handling (SMOTE, class weighting, under-sampling), and comprehensive evaluation metrics (AUC, F1, Recall, Precision, PR-AUC, and Brier Score). Empirical results show that Logistic Regression achieved the highest AUC (0.732), outperforming the nonlinear models under the baseline configuration, while LightGBM attained perfect recall (1.0) but low precision (0.116), indicating over-prediction of defaults. Gradient boosting models demonstrated robust calibration (Brier ≈ 0.114–0.116) and the best computational efficiency, with LightGBM showing the fastest training and lowest memory use. CatBoost exhibited strong recall but the slowest computation, and ANN underperformed on tabular data (AUC ≈ 0.56). The Stacked Ensemble delivered balanced results, with AUC = 0.664 and improved overall stability. These findings indicate that boosting-based models, particularly LightGBM and CatBoost, offer superior scalability and calibration, whereas Logistic Regression remains a valuable interpretable baseline. The study concludes that effective default prediction requires integrating rebalancing, calibration, and threshold optimization to improve recall and the reliability of operational deployment in large-scale credit ecosystems.
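The threshold-dependent metrics named in the abstract (Precision, Recall, F1), the Brier Score, and the idea of threshold optimization can be illustrated with a minimal standard-library Python sketch. It is not the authors' pipeline: the function names, the toy labels and scores, and the choice of F1 as the objective for picking a threshold are all assumptions made for illustration. The first toy case assumes a default rate of 11.6% (inside the reported ~9–12% range) and shows that flagging every borrower yields recall 1.0 with precision equal to the base rate, which is one possible reading of the reported LightGBM figures, not a confirmed explanation.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float
) -> Tuple[float, float, float]:
    """Precision, recall and F1 when scores >= threshold are flagged as default."""
    tp = fp = fn = 0
    for label, score in zip(y_true, y_prob):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1
        elif flagged and label == 0:
            fp += 1
        elif not flagged and label == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def best_f1_threshold(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Pick the observed score that maximizes F1 (an assumed objective)."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: precision_recall_f1(y_true, y_prob, t)[2])


# Hypothetical case 1: 1,000 borrowers, 116 defaults, every borrower flagged.
labels_all = [1] * 116 + [0] * 884
scores_all = [1.0] * 1000
p_all, r_all, _ = precision_recall_f1(labels_all, scores_all, 0.5)

# Hypothetical case 2: eight borrowers with made-up scores.
labels = [1, 1, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
chosen = best_f1_threshold(labels, scores)
p_opt, r_opt, f1_opt = precision_recall_f1(labels, scores, chosen)

print(f"flag everyone: precision={p_all:.3f}, recall={r_all:.3f}")
print(f"F1-optimal threshold={chosen}: precision={p_opt:.3f}, recall={r_opt:.3f}")
print(f"Brier score on toy scores: {brier_score(labels, scores):.4f}")
```

In the first toy case, precision collapses to the 11.6% base rate while recall is perfect. In the second, moving the threshold from the default 0.5 down to 0.4 recovers the missed default at the cost of one false positive. In practice the threshold would be tuned on held-out data, and the objective might be a cost-weighted measure rather than F1.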