Shehab, Eman
Unknown Affiliation

Published: 1 document


Character N-gram model for toxicity prediction
Shehab, Eman; Nayel, Hamada; Taha, Mohamed
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 13, No 4: December 2024
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v13.i4.pp4380-4387

Abstract

Molecular toxicity prediction is a crucial step in the drug discovery process and bears directly on human health and medical outcomes. Accurately assessing a molecule's toxicity helps weed out low-quality compounds early in the drug discovery phase, avoiding the depletion of resources later in the drug development process. Computational models have been used to automate molecular toxicity prediction. In this paper, a machine learning-based model is proposed. Molecules are given as simplified molecular-input line-entry system (SMILES) strings, from which character N-grams are extracted and represented with a TF-IDF weighting scheme. Multiple machine learning classifiers have been implemented: logistic regression (LR), support vector machine (SVM), random forest (RF), decision tree (DT), k-nearest neighbors (KNN), AdaBoost, multi-layer perceptron (MLP), and stochastic gradient descent (SGD). A wide range of N-gram orders was evaluated, and trigrams gave the best results. RF and SVM achieved 85% and 84% accuracy, respectively. Compared with state-of-the-art models, these results are acceptable given that the model uses minimal resources.
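
The representation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`char_ngrams`, `tfidf_vectors`), the three example SMILES strings, and the exact TF-IDF variant (length-normalized term frequency with a smoothed inverse document frequency) are assumptions made here for illustration, since the abstract does not specify them. The sketch uses only the Python standard library and stops at feature extraction; no classifier is trained and no toxicity labels are assumed.

```python
import math
from collections import Counter


def char_ngrams(smiles, n=3):
    """Return all overlapping character n-grams of a SMILES string."""
    return [smiles[i:i + n] for i in range(len(smiles) - n + 1)]


def tfidf_vectors(corpus, n=3):
    """Return (vocabulary, vectors) for a list of SMILES strings.

    Assumed weighting (not taken from the paper):
      tf  = n-gram count / total n-grams in the string
      idf = ln((1 + N) / (1 + df)) + 1, a smoothed variant
    """
    # One n-gram count table per molecule.
    counts = [Counter(char_ngrams(s, n)) for s in corpus]
    # Sorted vocabulary so that vector positions are reproducible.
    vocab = sorted(set().union(*counts))
    # Document frequency: in how many strings each n-gram occurs.
    df = Counter(gram for c in counts for gram in c)
    n_docs = len(corpus)
    idf = {g: math.log((1 + n_docs) / (1 + df[g])) + 1 for g in vocab}

    vectors = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against strings shorter than n
        vectors.append([(c[g] / total) * idf[g] for g in vocab])
    return vocab, vectors


# Hypothetical input: ethanol, acetic acid, benzene.
example_smiles = ["CCO", "CC(=O)O", "c1ccccc1"]
vocab, vectors = tfidf_vectors(example_smiles, n=3)
```

Each row of `vectors` is a fixed-length feature vector over the trigram vocabulary and could be passed to any of the classifiers listed in the abstract. Changing `n` gives the other N-gram orders that the abstract says were compared.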