k-Nearest Neighbor (k-NN) achieves good accuracy on data whose classes are roughly evenly distributed, but on data with an imbalanced class distribution its accuracy is generally lower. In addition, k-NN does not separate the data of each class, so every class has an equal influence in determining the class of new data; it is therefore important to select attributes that are relevant to the class before the classification process. To overcome these problems, we propose a framework that uses the Synthetic Minority Oversampling Technique (SMOTE) to address the class-imbalance problem and the Gain Ratio (GR) to perform attribute selection, producing a new dataset with a balanced class distribution and attributes that carry significant class information. The E-Coli and Glass Identification datasets were used in this study. For objective results, 10-fold cross-validation was used as the evaluation method with k values from 1 to 10. The results show that SMOTE and GR increase the accuracy of the k-NN method: the largest improvement, 18.5%, occurred on the Glass Identification dataset, and the smallest, 11.4%, on the E-Coli dataset. Overall, the proposed method performs better, although its precision, recall, and F1-score do not exceed those of the original k-NN on the E-Coli dataset. Across all datasets, precision improved by 41.0%, recall by 43.4%, and the F1-score by 41.5%.
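The following is a minimal sketch of the kind of pipeline described above (SMOTE oversampling, attribute selection, then k-NN evaluated with 10-fold cross-validation for k = 1..10), not the authors' exact implementation. It assumes the Glass Identification data is fetched from OpenML, uses mutual information as a stand-in for Gain Ratio ranking (scikit-learn has no built-in Gain Ratio scorer), and picks the number of selected attributes and SMOTE neighbors arbitrarily.

```
# Hedged sketch: SMOTE + attribute ranking + k-NN with 10-fold CV.
# Dataset source, the mutual-information stand-in for Gain Ratio, and the
# chosen hyperparameters are assumptions, not the paper's exact setup.
from sklearn.datasets import fetch_openml
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline  # applies SMOTE only to training folds

# Glass Identification (assumed OpenML copy of the UCI dataset).
X, y = fetch_openml("glass", version=1, return_X_y=True, as_frame=False)

results = {}
for k in range(1, 11):  # k = 1..10 as in the evaluation described above
    pipe = Pipeline([
        ("smote", SMOTE(k_neighbors=3, random_state=42)),   # balance the class distribution
        ("select", SelectKBest(mutual_info_classif, k=6)),  # attribute ranking (GR stand-in)
        ("knn", KNeighborsClassifier(n_neighbors=k)),
    ])
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
    results[k] = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy").mean()

best_k = max(results, key=results.get)
print(f"best k = {best_k}, mean 10-fold accuracy = {results[best_k]:.3f}")
```

Wrapping SMOTE and the attribute selector in an imblearn Pipeline keeps the oversampling and selection inside each training fold, so the cross-validated accuracy is not inflated by synthetic samples leaking into the test folds.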