Journal: International Journal of Engineering and Computer Science Applications (IJECSA)
Vol. 4 No. 2 (2025): September 2025

Handling Imbalanced Data in K-Nearest Neighbor Algorithm using Synthetic Minority Oversampling Technique-Nominal Continuous

Anjani Anjani (Unknown)
Hayati, Memi Nor (Unknown)
Surya Prangga (Unknown)



Article Info

Publish Date
02 Jul 2025

Abstract

Classification is a data mining task that predicts the class of an observation using a trained machine learning model. K-Nearest Neighbor (K-NN) is a classification method that assigns a class based on the distance to the nearest neighbors in the training data. However, K-NN has limitations in handling imbalanced class distributions, because neighbors from the majority class tend to dominate the prediction. This problem can be addressed by applying a class balancing technique. One such technique is the Synthetic Minority Oversampling Technique for Nominal and Continuous (SMOTE-NC), which is suitable for datasets containing both nominal and continuous variables. The aim of this research is to classify Honda motorcycle loan customer data at Company Z using K-NN combined with SMOTE-NC to address data imbalance. The research is experimental and uses 10-fold cross-validation to partition the data into training and testing sets. The input variables are gender, occupation, length of installment, income, installment amount, motorcycle price, and down payment; the output variable is payment status (current or non-current). The results show that the optimal K value for classification using K-NN with SMOTE-NC is K = 1, with an average APER (Average Probability of Error Rate) of 0.143. The best result is found in subset 8, with an APER of 0.033: of 61 data points, 34 current-status customers are correctly classified as current and 25 non-current-status customers are correctly classified as non-current, with one misclassification in each class. The study concludes that the combination of SMOTE-NC and K-NN (K = 1) provides high classification accuracy for imbalanced data and can be used effectively to support credit risk assessment in motorcycle financing.
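The subset 8 figures can be checked with a short calculation. The sketch below assumes that APER is the proportion of misclassified observations among all classified observations; the abstract does not state the formula, but this reading reproduces the reported 0.033. The function name `aper` and the confusion-matrix layout (rows are actual classes, columns are predicted classes) are illustrative choices, not taken from the paper.

```python
def aper(confusion):
    """Return the fraction of misclassified observations.

    confusion[i][j] is the number of observations whose actual class is i
    and whose predicted class is j (assumed layout, not from the paper).
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Correct predictions sit on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Subset 8 counts as reported in the abstract.
# Class order: (current, non-current).
subset_8 = [
    [34, 1],   # actual current: 34 correct, 1 misclassified
    [1, 25],   # actual non-current: 1 misclassified, 25 correct
]

aper_subset_8 = aper(subset_8)
print(round(aper_subset_8, 3))  # 2 / 61, prints 0.033
```

Under the same assumption, the average APER of 0.143 across the ten folds corresponds to an average proportion of correct classifications of about 0.857.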

Copyright © 2025






Journal Info

Abbrev

IJECSA

Subject

Computer Science & IT

Description

Description of Journal: The International Journal of Engineering and Computer Science Applications (IJECSA) is a scientific journal established as a forum for scientists, especially in the field of computer science, to publish their research papers. The 12th of the 12th month of 2021 is ...