Nearest Centroid Classifier with Outlier Removal for Classification
Bawono, Aditya Hari; Bahtiar, Fitra Abdurrahman; Supianto, Ahmad Afif
Journal of Information Technology and Computer Science Vol. 5 No. 1: April 2020
Publisher : Faculty of Computer Science (FILKOM) Brawijaya University

DOI: 10.25126/jitecs.202051162

Abstract

Classification methods can be misled by outliers. However, there is little research on classification with outlier removal, especially for the Nearest Centroid Classifier method. The proposed methodology consists of two stages. First, the data are preprocessed with outlier removal, which removes points that are far from their corresponding centroid. Second, the outlier-removed data are classified. The experiment covers six data sets with different characteristics. The results indicate that outlier removal as a preprocessing step improves Nearest Centroid Classifier performance on most of the data sets.
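The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how "far from the corresponding centroid" is decided, so the cutoff used here (class mean distance plus `z` times the standard deviation), the default `z=1.5`, and the function names `class_centroids`, `remove_outliers`, and `predict` are all assumptions.

```python
import math
from collections import defaultdict
from statistics import fmean, pstdev


def class_centroids(X, y):
    """Return {label: centroid}, the per-class mean of the points."""
    groups = defaultdict(list)
    for point, label in zip(X, y):
        groups[label].append(point)
    return {
        label: [fmean(column) for column in zip(*points)]
        for label, points in groups.items()
    }


def remove_outliers(X, y, z=1.5):
    """Stage 1: drop points that lie far from their own class centroid.

    ASSUMPTION: "far" means distance > mean + z * std of the distances
    within that class. The source does not specify the rule.
    """
    cents = class_centroids(X, y)
    dists = [math.dist(p, cents[label]) for p, label in zip(X, y)]
    per_class = defaultdict(list)
    for d, label in zip(dists, y):
        per_class[label].append(d)
    limit = {
        label: fmean(ds) + z * pstdev(ds) for label, ds in per_class.items()
    }
    keep = [i for i, d in enumerate(dists) if d <= limit[y[i]]]
    return [X[i] for i in keep], [y[i] for i in keep]


def predict(cents, point):
    """Stage 2: assign the label of the nearest centroid."""
    return min(cents, key=lambda label: math.dist(point, cents[label]))
```

A caller would run `remove_outliers` on the training set, recompute `class_centroids` on what remains, and then call `predict` for each new sample. A single extreme point can pull a class centroid toward another class, which is the effect the preprocessing stage is meant to limit.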
Efisiensi Big Data Menggunakan Improved Nearest Neighbor (Big Data Efficiency Using Improved Nearest Neighbor)
Bawono, Aditya Hari; Supianto, Ahmad Afif
Jurnal Teknologi Informasi dan Ilmu Komputer Vol 6 No 6: December 2019
Publisher : Fakultas Ilmu Komputer, Universitas Brawijaya

DOI: 10.25126/jtiik.2019662085

Abstract

Classification is one of the important methods in data mining. One of the most popular and fundamental classification methods is k-nearest neighbor (kNN). In kNN, the relationship between samples is measured by their degree of similarity, represented as a distance. In most cases, especially on large data sets, several samples will lie at the same distance yet may well not be selected as neighbors, so the choice of the parameter k strongly affects the kNN classification result. In addition, the sorting phase of kNN becomes a computational problem on large data. Classifying large data therefore calls for a more accurate and efficient method. Dependent Nearest Neighbor (dNN), the method proposed in this study, uses no parameter k and has no sample-sorting phase. The experimental results show that dNN is 3 times faster than kNN and that its accuracy is 13% better than that of kNN.
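For context, the kNN baseline that the abstract compares against can be sketched as below. This is the standard brute-force kNN, not the proposed dNN, whose procedure the abstract does not describe. The function name `knn_predict` is a hypothetical one chosen for illustration.

```python
import math
from collections import Counter


def knn_predict(X, y, query, k):
    """Brute-force kNN: measure distances, sort, vote among the k nearest."""
    # Distance from the query to every training sample.
    scored = [(math.dist(p, query), label) for p, label in zip(X, y)]
    # The sorting phase: O(n log n) per query, which is the step that
    # becomes expensive on large data.
    scored.sort(key=lambda pair: pair[0])
    # Majority vote among the k closest samples.
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

With training points `(1, 0)` labelled `"a"` and `(0, 2)`, `(2, 0)` labelled `"b"`, a query at the origin is classified as `"a"` for `k=1` but as `"b"` for `k=3`, which illustrates how strongly the result depends on k when several samples sit at the same distance.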