Mohd Farhan MD Fudzee, Mohd Farhan
Tun Hussein Onn University

Published : 4 Documents
Articles

Journal : International Journal of Advanced Science Computing and Engineering

Feature Extraction and Classification On Single Nucleotide Polymorphism
Kamarudin, Nur Fatihah; Ali Shah, Zuraini; Md Fudzee, Mohd Farhan; Kasim, Shahreen
International Journal of Advanced Science Computing and Engineering Vol. 1 No. 2 (2019)
Publisher : SOTVI

DOI: 10.62527/ijasce.1.2.6

Abstract

The Malay population of Peninsular Malaysia can be divided into eight sub-ethnic groups: Malay Bugis, Malay, Malay Champa, Malay Jawa, Malay Kelantan, Malay Kedah, Malay Minang and Malay Pattani. Ancestry informative markers (AIMs) can be used to represent these eight sub-ethnic groups. In this research, single nucleotide polymorphism (SNP) datasets of the eight sub-ethnic groups are analysed in order to obtain AIMs for the Malay population in Peninsular Malaysia. However, the datasets may contain outliers, missing data and redundancy that can reduce the accuracy of the results. Data pre-processing is therefore an important step for removing these problems. Iterative pruning principal component analysis (ipPCA) is a technique commonly used in the analysis of genome datasets to extract information. It can be applied to highly structured data, can improve the resolution of the data, and is also used to resolve sub-population structure. Random Forest and Hidden Naïve Bayes are used to classify the SNPs that can serve as AIMs. Information Gain Ratio then ranks the chosen AIMs based on the value of each attribute.
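
The ranking step mentioned at the end of the abstract can be illustrated with a minimal sketch. It assumes the standard definition of gain ratio (information gain divided by the split information of the attribute) and genotypes coded as 0, 1 or 2. The abstract does not give the authors' implementation, so the function names, SNP identifiers and toy data below are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of `feature` w.r.t. `labels`, divided by split information."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_snps(genotypes, labels):
    """Return (snp_name, gain_ratio) pairs sorted from most to least informative."""
    scores = {name: gain_ratio(column, labels)
              for name, column in genotypes.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four samples from two groups, three candidate SNPs.
labels = ["group_a", "group_a", "group_b", "group_b"]
genotypes = {
    "snp_1": [0, 0, 2, 2],  # separates the two groups perfectly
    "snp_2": [0, 1, 0, 1],  # unrelated to group membership
    "snp_3": [1, 1, 1, 1],  # monomorphic, no information
}
ranking = rank_snps(genotypes, labels)
```

In this toy case `snp_1` ranks first because its genotype fully determines the group label, while the other two SNPs carry no information about it.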