Denny Indrajaya
Department of Mathematics and Data Science, Faculty of Science and Mathematics, Universitas Kristen Satya Wacana, Salatiga, Central Java 50711


Comparison of k-Nearest Neighbor and Naive Bayes Methods for SNP Data Classification
Denny Indrajaya; Adi Setiawan; Bambang Susanto
MATRIK : Jurnal Manajemen, Teknik Informatika dan Rekayasa Komputer Vol 22 No 1 (2022)
Publisher : LPPM Universitas Bumigora

DOI: 10.30812/matrik.v22i1.1758

Abstract

In an accident, the identity of a victim is sometimes difficult to establish, so biological data such as Single Nucleotide Polymorphism (SNP) data may be needed to identify the person's origin. This research compares the accuracy and the F1 score of the k-Nearest Neighbor method and the Naive Bayes method in classifying SNP data from 120 people divided into two groups, namely European (CEU) and Yoruba (YRI). The best method is determined from the average accuracy and the average F1 score over 1000 iterations with various percentage splits between training and testing datasets. The SNP locations used in the classification process were selected by correlation analysis. The k-Nearest Neighbor method with k=31 obtained an average accuracy of 98.38% and an average F1 score of 98.39%, while the Naive Bayes method obtained an average accuracy of 96.74% and an average F1 score of 96.63%. In this case, the k-Nearest Neighbor method is better than the Naive Bayes method at classifying SNP data to determine whether a person's ancestry tends to be CEU or YRI.
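
The evaluation protocol described in the abstract (repeated random train/test splits, with accuracy and F1 score averaged over the iterations) can be illustrated with a minimal sketch. This is not the authors' code. It assumes synthetic genotypes coded 0/1/2 in place of the real CEU/YRI data, Euclidean distance for k-Nearest Neighbor, and a categorical Naive Bayes with Laplace smoothing, none of which the abstract specifies. All function names are hypothetical, the correlation-based SNP selection step is omitted, and the default number of iterations is far below the 1000 used in the study.

```python
import math
import random
from collections import Counter

def make_synthetic_snps(n_per_class=60, n_snps=20, seed=0):
    """Synthetic stand-in for SNP data: genotypes coded 0/1/2 (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, freq in ((0, 0.2), (1, 0.7)):  # 0 ~ "CEU", 1 ~ "YRI" (labels only)
        for _ in range(n_per_class):
            # Each genotype is the sum of two allele draws with frequency freq.
            X.append([int(rng.random() < freq) + int(rng.random() < freq)
                      for _ in range(n_snps)])
            y.append(label)
    return X, y

def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k training rows closest to x (Euclidean)."""
    nearest = sorted((math.dist(row, x), label)
                     for row, label in zip(X_train, y_train))
    return Counter(label for _, label in nearest[:k]).most_common(1)[0][0]

def nb_fit(X_train, y_train, n_values=3, alpha=1.0):
    """Categorical Naive Bayes: log priors and Laplace-smoothed log likelihoods."""
    log_prior, log_like = {}, {}
    for c in sorted(set(y_train)):
        rows = [row for row, label in zip(X_train, y_train) if label == c]
        log_prior[c] = math.log(len(rows) / len(X_train))
        total = len(rows) + alpha * n_values
        log_like[c] = []
        for j in range(len(X_train[0])):
            counts = Counter(row[j] for row in rows)
            log_like[c].append([math.log((counts[v] + alpha) / total)
                                for v in range(n_values)])
    return log_prior, log_like

def nb_predict(model, x):
    """Pick the class with the highest posterior log score for x."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][j][v] for j, v in enumerate(x))
              for c in log_prior}
    return max(scores, key=scores.get)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """F1 = 2TP / (2TP + FP + FN) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def compare(X, y, k=31, train_frac=0.7, n_iter=20, seed=0):
    """Average accuracy and F1 of both classifiers over random splits."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    totals = {"knn_acc": 0.0, "knn_f1": 0.0, "nb_acc": 0.0, "nb_f1": 0.0}
    for _ in range(n_iter):
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))
        X_tr, y_tr = [X[i] for i in idx[:cut]], [y[i] for i in idx[:cut]]
        X_te, y_te = [X[i] for i in idx[cut:]], [y[i] for i in idx[cut:]]
        model = nb_fit(X_tr, y_tr)
        knn_pred = [knn_predict(X_tr, y_tr, x, k) for x in X_te]
        nb_pred = [nb_predict(model, x) for x in X_te]
        totals["knn_acc"] += accuracy(y_te, knn_pred)
        totals["knn_f1"] += f1_score(y_te, knn_pred)
        totals["nb_acc"] += accuracy(y_te, nb_pred)
        totals["nb_f1"] += f1_score(y_te, nb_pred)
    return {name: total / n_iter for name, total in totals.items()}

if __name__ == "__main__":
    X, y = make_synthetic_snps()
    for name, value in compare(X, y).items():
        print(f"{name}: {value:.4f}")
```

Because the data here are synthetic and strongly separated, the resulting averages say nothing about the figures reported in the abstract; the sketch only shows how the two averaged metrics would be computed for each method. Varying `train_frac` mirrors the different training/testing percentages mentioned above.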