Narra X
Vol. 4 No. 1 (2026): April 2026 (In Press)

Random forest-based QSAR modeling for predicting the potency of neprilysin inhibitors using Mordred molecular descriptors

Albar, Nizam
Rampengan, Derren DCH.
Azhari, Saiful
Mahmudi, Mahmudi
Fahdhienie, Farrah
Susilawati, Anggi
Habiburrahman, Muhammad



Article Info

Publish Date
01 Apr 2026

Abstract

Neprilysin (NEP), a zinc-dependent metallopeptidase, is a key therapeutic target in heart failure management, yet efficient identification of potent NEP inhibitors remains a challenge in drug discovery. The aim of this study was to develop a quantitative structure–activity relationship (QSAR) model using 2D Mordred molecular descriptors and a Random Forest algorithm to predict the inhibitory potency (pIC₅₀) of drug candidates. A curated dataset of compounds with experimentally determined IC₅₀ values (in nM) against NEP was preprocessed, and the IC₅₀ values were converted to pIC₅₀. Mordred was used to calculate 2D molecular descriptors, and descriptors with missing values were excluded. The dataset was split into training, internal validation, and external test sets. A Random Forest regression model was trained using 500 estimators, and its performance was evaluated using the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and concordance correlation coefficient (CCC); a binary classification model was also constructed. Feature importance, residual analysis, and chemical space visualization were used to assess model interpretability and reliability. In internal validation, the regression model achieved R²=0.286, RMSE=0.949, MAE=0.723, and CCC=0.532. External validation gave stronger results, with R²=0.659, RMSE=0.858, MAE=0.630, and CCC=0.763. The binary classification model achieved an accuracy of 0.953, precision of 1.000, recall of 0.943, and an F1-score of 0.971, indicating strong discriminative ability in classifying inhibitory versus non-inhibitory compounds. The top contributing descriptors were ATSC2p (feature importance=0.0505), GATS2p (0.0408), and SaasC (0.0317). Principal component analysis (PCA) and Williams plots confirmed that test compounds lie within the model’s applicability domain, with no major outliers in leverage or residual distribution.
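For orientation, the quantities reported in this abstract are usually defined as sketched below. These are the standard textbook forms and not necessarily the exact formulas used by the authors. The symbols are assumptions introduced here: y_i is an observed pIC50, ŷ_i the corresponding prediction, n the number of compounds, p the number of descriptors, X the descriptor matrix with rows x_i, P precision, and R recall. The leverage warning threshold h* = 3(p+1)/n is one common convention for Williams plots; the abstract does not state which threshold was applied.

```latex
\begin{align}
\mathrm{pIC}_{50} &= -\log_{10}\!\left(\mathrm{IC}_{50}\,[\mathrm{M}]\right)
                   = 9 - \log_{10}\!\left(\mathrm{IC}_{50}\,[\mathrm{nM}]\right) \\
R^2 &= 1 - \frac{\sum_{i}(y_i - \hat{y}_i)^2}{\sum_{i}(y_i - \bar{y})^2} \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n}\sum_{i}(y_i - \hat{y}_i)^2},
\qquad
\mathrm{MAE} = \frac{1}{n}\sum_{i}\lvert y_i - \hat{y}_i\rvert \\
\mathrm{CCC} &= \frac{2\,s_{y\hat{y}}}{s_y^2 + s_{\hat{y}}^2 + (\bar{y} - \bar{\hat{y}})^2} \\
F_1 &= \frac{2PR}{P + R} \\
h_i &= x_i^{\top}\,(X^{\top}X)^{-1}\,x_i,
\qquad
h^{*} = \frac{3(p+1)}{n}
\end{align}
```

As a consistency check on the classification figures, inserting the reported precision (1.000) and recall (0.943) into the F1 expression gives 1.886/1.943 ≈ 0.971, which matches the reported F1-score.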
The developed Random Forest-based QSAR model demonstrates strong predictive power and interpretability for identifying NEP inhibitors. This study provides a valuable tool for virtual screening and highlights the relevance of 2D structural features in governing NEP inhibitory activity. It is the first dedicated QSAR analysis of neprilysin inhibition using Mordred descriptors with rigorous internal and external validation.
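As a rough illustration of how predictions from such a model might be scored in a screening workflow, the following is a minimal sketch in pure Python of the IC50-to-pIC50 conversion and the four regression metrics named in the abstract. The function names (`pic50_from_nm`, `regression_metrics`, `f1_from_precision_recall`) are hypothetical, the IC50 and prediction values are made-up toy numbers rather than data from the study, and the authors' actual implementation is not described in the abstract and may well differ.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 (-log10 of the molar IC50)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and Lin's concordance correlation coefficient."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    # Population (1/n) variances and covariance, as in Lin's definition of CCC.
    var_t = ss_tot / n
    var_p = sum((p - mean_p) ** 2 for p in y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)) / n
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
        "ccc": 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2),
    }


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Toy example: hypothetical IC50 values (nM) and hypothetical model predictions.
toy_ic50_nm = [1.0, 10.0, 100.0, 1000.0]
observed = [pic50_from_nm(v) for v in toy_ic50_nm]
predicted = [8.5, 8.0, 7.5, 6.0]
metrics = regression_metrics(observed, predicted)
print(metrics)

# Reported precision and recall from the abstract, used only to recompute F1.
print(round(f1_from_precision_recall(1.000, 0.943), 3))
```

Computing CCC alongside R² is a deliberate choice in this sketch: CCC also penalizes systematic shifts between predicted and observed values, which R² alone can obscure.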

Copyrights © 2026






Journal Info

Abbrev

main

Subject

Biochemistry, Genetics & Molecular Biology; Chemistry; Immunology & Microbiology; Materials Science & Nanotechnology; Medicine & Pharmacology; Physics; Public Health

Description

Narra X is a multidisciplinary journal published three times a year (April, August, and December). The journal aims to act as a platform for rapid scientific communication while upholding the highest integrity. Articles are published in the form of Original Articles, Short Reports, Case Reports, ...