Attaufiqqurrohman, Hadit
Unknown Affiliation

Published: 1 Document
Articles

ALGORITMA RANDOM FOREST UNTUK PREDIKSI STATUS PINJAMAN BERDASARKAN SKOR KREDIT (Random Forest Algorithm for Loan Status Prediction Based on Credit Score)
Attaufiqqurrohman, Hadit; Ade Irma Purnamasari; Denni Pratama; Nining Rahaningsih; Willy Prihartono
METHODIKA: Jurnal Teknik Informatika dan Sistem Informasi, Vol. 12 No. 1 (2026)
Publisher : Universitas Methodist Indonesia

Abstract

The rapid development of financial technology has pushed financial institutions to adopt data-driven credit scoring systems to minimize default risk. However, many loan eligibility prediction models still struggle with class imbalance and with the limited ability of traditional models to capture non-linear relationships among variables. This study develops a loan status prediction model using the Random Forest algorithm combined with the Synthetic Minority Oversampling Technique (SMOTE) and One-Hot Encoding (OHE) to improve accuracy and generalization. The data are secondary data from the public Kaggle platform: 45,000 records with 14 demographic and financial attributes. The method follows a supervised learning approach in three stages: data acquisition and preprocessing (cleaning, normalization, encoding, and class balancing), Random Forest model training, and performance evaluation using accuracy, precision, recall, F1-score, and AUC. The combination of Random Forest, SMOTE, and OHE achieves an accuracy of 94.8%, precision of 95.6%, recall of 93.7%, F1-score of 94.6%, and an AUC of 0.972. The most influential variables in loan status prediction are credit_score, person_income, and loan_amnt. The results indicate that this approach is effective at addressing class imbalance and at distinguishing creditworthy from non-creditworthy borrowers.
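To make the balancing step and the reported metrics more concrete, the following is a minimal, standard-library sketch and not the authors' implementation. It illustrates the core idea of SMOTE (synthetic minority points are created by interpolating between a minority sample and one of its nearest minority neighbours) and checks that the reported F1-score is consistent with the reported precision and recall. The function names, the toy data values, and the choice of `k` are assumptions made for illustration only; the abstract does not state the study's SMOTE settings.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration; real projects would typically use a
    maintained library implementation instead.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy minority class: (credit_score, person_income) pairs with invented values.
minority = [
    (580.0, 32000.0),
    (600.0, 35000.0),
    (615.0, 41000.0),
    (590.0, 30000.0),
]
new_points = smote_oversample(minority, n_new=6, k=2)
print(len(new_points))  # 6

# Precision and recall as reported in the abstract.
reported_f1 = f1_score(0.956, 0.937)
print(round(reported_f1, 3))  # 0.946
```

Because the synthetic points are interpolations, each one lies between two existing minority samples. In this toy data the income values dominate the distance computation, which is one reason normalization usually precedes balancing, as in the preprocessing order the abstract lists.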