Raden Rizky Widdie Tigusti
Faculty of Computer Science, Universitas Brawijaya

Articles

Implementation of Fuzzy K-Nearest Neighbor (FK-NN) to Classify Compound Functions Based on the Simplified Molecular Input Line Entry System (SMILES)
Raden Rizky Widdie Tigusti; Dian Eka Ratnawati; Syaiful Anam
Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer Vol 2 No 12 (2018): December 2018
Publisher: Faculty of Computer Science (FILKOM), Universitas Brawijaya

Abstract

An active compound is a chemical compound that can serve many functions, one of which is as a medicine. Active compounds have specific characteristics that determine their function as a drug. To obtain characteristic values for an active compound, SMILES notation is used as the system input. SMILES is a modern chemical notation that can be stored in a string variable and used in computation. To characterize a compound, its SMILES notation is broken down into 12 features: the counts of B, C, N, O, P, S, F, Cl, Br, I, and OH, plus the length of the SMILES string. The value of each feature is obtained by preprocessing the SMILES notation at the start of the classification process. The Fuzzy K-Nearest Neighbor method is used to classify the function of active compounds because it can handle large amounts of data. It combines two methods, fuzzy logic and K-Nearest Neighbor. The key steps of classification with Fuzzy K-Nearest Neighbor are to compute the Euclidean distance from each test sample to the training data, select the k nearest neighbors, and compute the fuzzy membership values. The study used a dataset of 631 records, split into training data (80%, 503 records) and test data (20%, 128 records). The resulting accuracy was 71% with k = 15; in a separate test using k-fold cross-validation, the highest accuracy was 77%.
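
The pipeline described in the abstract (12 SMILES features, Euclidean distance, k nearest neighbors, fuzzy membership) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not give the preprocessing rules or the membership formula, so several parts are assumptions: the regex tokenizer (which counts only uppercase symbols, counts an explicit `OH` once as OH rather than as O, and would miscount other two-letter elements such as Na or Si), the inverse-distance membership weighting in the style of Keller et al. with fuzzifier `m = 2`, and the names `smiles_features` and `fuzzy_knn_predict`, which are hypothetical.

```python
import math
import re
from collections import defaultdict

# The 11 symbol features named in the abstract; SMILES length is appended as the 12th.
FEATURES = ["B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I", "OH"]

# Assumption: two-letter tokens are tried first so "Cl" is not counted as "C",
# "Br" is not counted as "B", and an explicit "OH" is not counted as "O".
TOKEN = re.compile(r"Cl|Br|OH|[BCNOPSFI]")


def smiles_features(smiles):
    """Return the 12-element feature vector for one SMILES string."""
    counts = dict.fromkeys(FEATURES, 0)
    for token in TOKEN.findall(smiles):
        counts[token] += 1
    return [counts[name] for name in FEATURES] + [len(smiles)]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_knn_predict(train_x, train_y, query, k=15, m=2.0, eps=1e-9):
    """Classify `query` and return (best_label, memberships).

    Assumption: training labels are crisp, and each of the k nearest
    neighbors votes with weight 1 / d^(2 / (m - 1)).
    """
    # Keep the k training samples closest to the query.
    neighbors = sorted(
        ((euclidean(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]

    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    total = 0.0
    for dist, label in neighbors:
        # eps avoids division by zero when the query equals a training sample.
        w = 1.0 / (dist ** exponent + eps)
        weights[label] += w
        total += w

    # Normalize so the class memberships sum to 1.
    memberships = {label: w / total for label, w in weights.items()}
    best = max(memberships, key=memberships.get)
    return best, memberships


if __name__ == "__main__":
    # Toy data with hypothetical class labels, for illustration only.
    train_smiles = ["CCO", "CCCO", "CCCl", "CCCCl"]
    train_labels = ["class_a", "class_a", "class_b", "class_b"]
    train_x = [smiles_features(s) for s in train_smiles]
    label, memberships = fuzzy_knn_predict(
        train_x, train_labels, smiles_features("CCCCO"), k=3
    )
    print(label, memberships)
```

Unlike plain K-Nearest Neighbor, which returns only a majority vote, this formulation returns a membership degree per class, so a prediction with memberships close to each other can be flagged as uncertain. The value `k = 15` used as the default mirrors the setting reported in the abstract; the toy call uses `k = 3` only because the example set has four samples.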