Compounds are common in nature, and a substance is a collection of compounds (Educated, 2015). Compounds are divided into active and inactive compounds. A compound's function can be put to use, for example when it acts as a drug or stimulates hormone activity. Compounds can be written in SMILES (Simplified Molecular Input Line Entry System) notation, introduced by David Weininger in the 1980s. SMILES notation uses ASCII characters, which are easy for a computer to process, so classifying SMILES strings is a useful way to identify a compound's function class. This study classifies compound function from SMILES notation using the C4.5 algorithm, with two function classes as targets: cancer and metabolism. Eleven features were tested. The best result, an accuracy of 79.34%, was obtained when the features were discretized with entropy-based discretization, the SMILES notation values were divided on each feature attribute, and as much usable data as possible was used. Under cross-validation, accuracy was 70.18%.
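As a rough illustration of the entropy-based discretization mentioned above, the following minimal Python sketch picks the single cut point on one numeric feature that gives the highest information gain, which is the kind of criterion C4.5 applies to continuous attributes. It is not the study's implementation. The feature (a count of the characters `C` and `c` in a SMILES string), the function names, the example strings, and the placeholder labels `class_a` and `class_b` are all assumptions made for illustration; the study's 11 features are not listed here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_entropy_split(values, labels):
    """Return (threshold, information_gain) for the best binary cut point."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_threshold, best_gain = None, 0.0
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # equal neighbours offer no cut point between them
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - weighted
        if gain > best_gain:
            best_threshold, best_gain = (lo + hi) / 2, gain
    return best_threshold, best_gain


def count_c_characters(smiles):
    """Hypothetical feature: how many 'C' or 'c' characters the string contains."""
    return sum(1 for ch in smiles if ch in "Cc")


# Toy inputs with placeholder labels; these are NOT data from the study.
smiles_strings = ["CCO", "CC(=O)O", "CCN", "c1ccccc1", "CCCCCCCC", "c1ccccc1O"]
toy_labels = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]

feature_values = [count_c_characters(s) for s in smiles_strings]
threshold, gain = best_entropy_split(feature_values, toy_labels)
print(threshold, gain)
```

On this toy input the chosen threshold separates the two placeholder classes completely, so values below the cut fall into one bin and values above it into the other. Entropy-based discretization schemes commonly repeat this split recursively inside each bin until a stopping rule is met; the sketch shows only the first cut.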
Copyrights © 2019