Active compounds are chemical compounds with many functions, one of which is use as medicine. Each active compound has characteristics that determine its function as a drug. To obtain characteristic values for an active compound, its SMILES notation is used as the system input. SMILES is a chemical line notation that can be stored in a string variable and processed computationally. To characterize a compound, its SMILES notation is broken down into 12 features: B, C, N, O, P, S, F, Cl, Br, I, OH, and the length of the SMILES string. The value of each feature is obtained by preprocessing the SMILES notation at the start of the classification process. The Fuzzy K-Nearest Neighbor method is used to classify the function of active compounds because it can handle large amounts of data. The method combines the Fuzzy and K-Nearest Neighbor methods. The key steps of classification with Fuzzy K-Nearest Neighbor are to calculate the Euclidean distance from each test sample to the training data, select the k nearest neighbors, and calculate the fuzzy membership values. The study used a dataset of 631 records, split into training data and test data at 80% (503 records) and 20% (128 records), respectively. The resulting accuracy was 71% at k = 15; in a separate test using k-fold cross-validation, the highest accuracy was 77%.
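The preprocessing and classification steps described above can be illustrated with a minimal Python sketch. It is not the study's implementation. It rests on several assumptions the abstract does not confirm: each of the first 11 features is a count of the symbol in the SMILES string, "OH" is matched as a literal substring whose O is not also counted under O, lowercase aromatic atoms and other elements are ignored, training labels are crisp, and the fuzzifier m is 2. The names `smiles_features` and `fuzzy_knn` are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Feature order follows the abstract: ten element symbols, OH, then length.
SYMBOLS = ["B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I", "OH"]

# Two-letter tokens come first so "Cl" is not read as "C" and "Br" not as "B".
# Assumption: "OH" is a literal substring and its O is not counted again as O.
TOKEN_RE = re.compile(r"Cl|Br|OH|[BCNOPSFI]")


def smiles_features(smiles):
    """Return the 12 features: 11 symbol counts plus the string length."""
    counts = Counter(TOKEN_RE.findall(smiles))
    return [counts[s] for s in SYMBOLS] + [len(smiles)]


def fuzzy_knn(train_X, train_y, x, k=15, m=2.0):
    """Classify x; return (predicted label, class membership dict).

    Assumptions: crisp training labels and fuzzifier m = 2 (must be > 1).
    """
    # Step 1: Euclidean distance from the query to every training sample.
    scored = [(math.dist(row, x), label) for row, label in zip(train_X, train_y)]
    # Step 2: keep the k nearest neighbours.
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    # Step 3: inverse-distance weights, normalised into class memberships.
    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    total = 0.0
    for dist, label in nearest:
        weight = 1.0 / max(dist, 1e-9) ** exponent  # guard against dist == 0
        weights[label] += weight
        total += weight
    memberships = {label: w / total for label, w in weights.items()}
    return max(memberships, key=memberships.get), memberships
```

A call such as `fuzzy_knn(train_X, train_y, smiles_features("CC(=O)O"), k=15)` would return the class with the highest membership together with the membership of every class among the neighbours. Features are left unscaled here, so the length feature can dominate the distance; whether the study normalized its features is not stated.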
Copyright © 2018