Cancer is one of the deadliest diseases, alongside heart disease. A common cause of cancer is a mutation in the gene for protein 53 (p53), which regulates cell function by controlling DNA replication; the mutation results in an incorrect protein sequence. Protein sequences can therefore serve as a basis for classifying types of cancer, which in turn helps in choosing the appropriate treatment or therapeutic method. In this study, cancer is classified using the Fuzzy k-Nearest Neighbor (Fk-NN) method. The data consist of 752 protein sequences, each 393 residues long. The classes are non-cancer, breast cancer, colorectal cancer, and lung cancer. The Fk-NN method computes the degree of membership in each class from the k smallest distances produced by the k-Nearest Neighbor method. In tests using k-fold cross-validation, the highest average accuracy is 52.56%. The optimal value of k for the Fk-NN method is k = 5, with an average accuracy of 54.99%. Among the training-set sizes tested, using 90% of the dataset for training gives the highest accuracy, 55.33%.
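The membership step described above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a Hamming distance between equal-length sequences, crisp (0 or 1) class memberships for the training samples, and a fuzzifier m = 2, none of which are stated in the abstract. The names `hamming` and `fknn_predict` are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Count positions where two equal-length sequences differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b))


def fknn_predict(train, query, k=5, m=2.0):
    """Return (predicted_label, memberships) for one query sequence.

    train: list of (sequence, label) pairs; labels are treated as crisp
    memberships (an assumption). m > 1 is the fuzzifier (assumed m = 2).
    """
    # k-NN step: keep the k training samples with the smallest distances.
    neighbors = sorted(
        ((hamming(seq, query), label) for seq, label in train),
        key=lambda pair: pair[0],
    )[:k]

    # Fuzzy step: weight each neighbor by inverse distance raised to 2/(m-1),
    # then normalize so the class memberships sum to 1.
    eps = 1e-9  # avoids division by zero when a neighbor is identical
    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    total = 0.0
    for dist, label in neighbors:
        w = 1.0 / (dist ** exponent + eps)
        weights[label] += w
        total += w

    memberships = {label: w / total for label, w in weights.items()}
    predicted = max(memberships, key=memberships.get)
    return predicted, memberships
```

Calling `fknn_predict(train, query, k=5)` returns the class with the highest membership together with the membership degree of every class found among the k neighbors, so a low maximum membership flags an uncertain prediction.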
Copyrights © 2019