Mahda Nurayuni
Program Studi Sistem Informasi, STMIK Muhammadiyah Paguyangan Brebes, Brebes, Jawa Tengah 52276, Indonesia

Published: 1 document
Articles


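Optimasi Algoritma K-Nearest Neighbors Berdasarkan Perbandingan Analisis Outlier (Berbasis Jarak, Kepadatan, LOF) [Optimization of the K-Nearest Neighbors Algorithm Based on a Comparison of Outlier Analysis (Distance-Based, Density-Based, LOF)]
Fitri Ayuning Tyas; Mahda Nurayuni; Hidayatur Rakhmawati
Jurnal Nasional Teknik Elektro dan Teknologi Informasi Vol 13 No 2: May 2024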
Publisher : Departemen Teknik Elektro dan Teknologi Informasi, Fakultas Teknik, Universitas Gadjah Mada

DOI: 10.22146/jnteti.v13i2.9579

Abstract

Current data growth affects data analysis in various fields, such as astronomy, business, medicine, education, and finance. Collected and stored data contain extreme values, that is, observations that differ from most other observations. These extreme values are called outliers. Outliers in some data often hold valuable information, so they must be examined thoroughly to decide whether to retain or discard them before data mining is applied. Outlier detection can be performed as part of data preprocessing using outlier analysis techniques. Commonly used techniques include distance-based methods, density-based methods, and the local outlier factor (LOF) method. k-nearest neighbors (KNN) is a data mining algorithm that is susceptible to outliers because of its reliance on the value of k. Hence, an appropriate handling mechanism is essential when applying KNN to datasets that contain outliers. An experimental method was used to apply the proposed approach, which aims to optimize the KNN algorithm by comparing outlier analysis methods (KNN-distance, KNN-density, and KNN-LOF). The results showed that KNN-density significantly outperformed the others, achieving an average accuracy of 99.34% at k=3 and k=5 for Wisconsin Breast Cancer, 85.25% at k=7 for Glass, and 85.45% at k=5 for Lymphography. Moreover, both the Friedman and Nemenyi tests confirmed a notable difference between KNN-density and KNN-LOF.
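
To make the preprocessing idea concrete, the following is a minimal sketch of one of the three variants named in the abstract: score each point with LOF, drop high-scoring points, then classify with KNN. It is not the authors' implementation. The toy data, the helper names (`k_nearest`, `lof_scores`, `knn_predict`), and the cut-off of 1.5 are all assumptions made for illustration, and the code assumes no duplicate points (which would give a zero reachability distance).

```python
import math
from collections import Counter

def k_nearest(points, i, k):
    """Return (distance, index) pairs for the k nearest neighbours of points[i]."""
    dists = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points)) if j != i
    )
    return dists[:k]

def lof_scores(points, k):
    """Local outlier factor per point; values well above 1 suggest an outlier."""
    neigh = [k_nearest(points, i, k) for i in range(len(points))]
    k_dist = [n[-1][0] for n in neigh]  # distance to the k-th neighbour
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(len(points)):
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(len(points))
    ]

def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip((math.dist(x, query) for x in train_X), train_y))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical toy data: two tight clusters plus one isolated point.
X = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (0.8, 0.9),
     (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (5.1, 5.2), (4.8, 4.9),
     (3.0, 9.0)]
y = ["A"] * 5 + ["B"] * 5 + ["A"]

THRESHOLD = 1.5  # assumed cut-off, not a value taken from the paper
scores = lof_scores(X, k=3)
kept = [i for i, s in enumerate(scores) if s <= THRESHOLD]

clean_X = [X[i] for i in kept]
clean_y = [y[i] for i in kept]
prediction = knn_predict(clean_X, clean_y, (4.7, 5.3), k=3)
print(kept, prediction)
```

In this sketch only the isolated point at (3.0, 9.0) exceeds the cut-off, so the classifier is trained on the two clusters alone. The distance-based and density-based variants would replace `lof_scores` with a different scoring function while leaving the filtering and KNN steps unchanged.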