Classification is a data mining task that predicts the class of new data using a trained machine learning model. K-Nearest Neighbor (K-NN) is a classification method that assigns a class based on the distances to the nearest neighbors in the training data. However, K-NN performs poorly when the class distribution is imbalanced. This limitation can be addressed with a class balancing technique such as the Synthetic Minority Oversampling Technique for Nominal and Continuous features (SMOTE-NC), which is suited to datasets containing both nominal and continuous variables. The aim of this research is to classify Honda motorcycle loan customer data at Company Z using K-NN combined with SMOTE-NC to handle the imbalanced data. The research method is experimental, using 10-fold cross-validation to partition the training and testing data. The input variables are gender, occupation, installment period, income, installment amount, motorcycle price, and down payment; the output variable is payment status (current or non-current). The results show that the optimal K value for K-NN with SMOTE-NC is K = 1, with an average APER (Apparent Error Rate) of 0.143. The best result is obtained in subset 8, with an APER of 0.033: of its 61 observations, 34 current-status customers and 25 non-current-status customers are classified correctly, with one misclassification in each class. The study concludes that the combination of SMOTE-NC and K-NN (K = 1) provides high classification accuracy on imbalanced data and can effectively support credit risk assessment in motorcycle financing.
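To make the described pipeline concrete, the sketch below shows one way the SMOTE-NC + K-NN (K = 1) procedure with 10-fold cross-validation and per-fold APER could be implemented in Python using scikit-learn and imbalanced-learn. This is not the authors' code; the randomly generated placeholder data and the nominal/continuous column layout are assumptions for illustration only, and the actual customer dataset would be loaded in their place.

```python
# Minimal sketch of the SMOTE-NC + K-NN workflow described in the abstract.
# Placeholder data only; column layout mirrors the stated input variables.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from imblearn.over_sampling import SMOTENC

rng = np.random.default_rng(0)

# Columns 0-1 are nominal (gender, occupation); columns 2-6 are continuous
# (installment period, income, installment amount, motorcycle price, down payment).
n = 600
X = np.column_stack([
    rng.integers(0, 2, n),        # gender (nominal)
    rng.integers(0, 5, n),        # occupation (nominal)
    rng.integers(12, 36, n),      # installment period (months)
    rng.normal(5e6, 1e6, n),      # income
    rng.normal(8e5, 2e5, n),      # installment amount
    rng.normal(2e7, 3e6, n),      # motorcycle price
    rng.normal(3e6, 1e6, n),      # down payment
])
y = (rng.random(n) < 0.25).astype(int)  # imbalanced target: 1 = non-current

aper_per_fold = []
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    X_tr, X_te = X[train_idx], X[test_idx]
    y_tr, y_te = y[train_idx], y[test_idx]

    # Oversample the minority class on the training fold only;
    # categorical_features marks the nominal columns for SMOTE-NC.
    smote_nc = SMOTENC(categorical_features=[0, 1], random_state=0)
    X_res, y_res = smote_nc.fit_resample(X_tr, y_tr)

    # K-NN with K = 1, the value reported as optimal in the study.
    knn = KNeighborsClassifier(n_neighbors=1)
    knn.fit(X_res, y_res)
    y_pred = knn.predict(X_te)

    # APER for this fold: proportion of misclassified test observations.
    aper_per_fold.append(np.mean(y_pred != y_te))

print(f"mean APER over 10 folds: {np.mean(aper_per_fold):.3f}")
```

Note that resampling is applied only to the training fold in each iteration, so the test fold retains the original class imbalance and the reported APER reflects performance on unmodified data.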