Comparative Analysis of Ensemble Machine Learning Models for QSAR-Based Prediction of Anticoagulant Activity in Thrombotic Disorders
Noviandy, Teuku Rizky; Sufri, Rahmat; Setiawan, Ryan; Anisah, Anisah
Heca Journal of Applied Sciences Vol. 4 No. 1 (2026): March 2026
Publisher : Heca Sentra Analitika

DOI: 10.60084/hjas.v4i1.393

Abstract

Thrombotic disorders remain a major cause of global morbidity and mortality, with dysregulation of blood coagulation pathways playing a central role in disease progression. In particular, thrombin is a key therapeutic target for anticoagulant drug development, making accurate prediction of inhibitory activity highly relevant for accelerating discovery efforts. Despite advances in computational drug discovery, there is still a need for systematic evaluation of machine learning approaches for QSAR-based prediction of anticoagulant activity. Many existing studies focus on single models or lack consistent comparison frameworks, limiting insight into the relative performance of different ensemble techniques. To address this gap, this study applies multiple ensemble machine learning methods, including Random Forest, XGBoost, Gradient Boosting, and Extra Trees, combined with hyperparameter optimization using random search. The main objective is a comparative analysis of these ensemble models for predicting pIC50 values of thrombin inhibitors from molecular descriptors derived from chemical structures. The results show that the Extra Trees model achieved the best overall performance, with an R² of 0.697, RMSE of 0.851, and MAE of 0.615 after tuning. Additionally, Gradient Boosting and XGBoost demonstrated significant improvement following hyperparameter optimization, highlighting the importance of model tuning in QSAR tasks. Overall, the study confirms that ensemble learning methods yield reliable, accurate predictions of anticoagulant activity, with Extra Trees emerging as the most effective approach for this dataset.
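
The abstract reports R², RMSE, and MAE for predicted pIC50 values. As a minimal sketch of how these quantities are conventionally computed, the snippet below uses only the Python standard library. It is not the authors' code: the function names, the assumption that IC50 is given in nanomolar, and the sample values are all hypothetical and serve only to illustrate the standard definitions.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Assumption: the input unit is nM; the abstract does not state the unit.
    """
    return -math.log10(ic50_nm * 1e-9)


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, and MAE using their standard definitions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Made-up pIC50 values for illustration only (not data from the study).
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 6.0, 6.5, 8.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Because pIC50 is on a log10 scale, an RMSE of about 0.85 (as reported for the tuned Extra Trees model) corresponds to a typical error of a bit under one order of magnitude in IC50.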