Variable Selection in Kernel Ridge Regression based on Sparrow Search Algorithm with Application QSAR Modeling
Al-Shabaki, Zainab Modhfer Ali; Algamal, Zakariya Yahya
Journal of Multidisciplinary Applied Natural Science, Articles in Press
Publisher: Pandawa Institute

DOI: 10.47352/jmans.2774-3047.360

Abstract

Variable selection plays a critical role in enhancing the predictive accuracy, interpretability, and computational efficiency of kernel ridge regression (KRR) models, especially when applied to high-dimensional datasets such as those used in quantitative structure-activity relationship (QSAR) modeling. This study investigates improved binary sparrow search algorithm (BSSA) variants that incorporate different transfer functions for variable selection in KRR. The variants were evaluated on seven benchmark biopharmaceutical datasets with thousands of molecular descriptors, and their prediction accuracy, variable subset compactness, and computational cost were compared against a baseline KRR fitted without variable selection. Results demonstrate that all BSSA variants significantly outperform the baseline KRR in terms of mean squared error (MSE) and coefficient of determination. The quadratic-BSSA (Q-BSSA) variant consistently achieved the best predictive performance, reducing MSE by up to 30% and raising the coefficient of determination above 0.95 on several datasets while selecting the fewest variables, which indicates effective and parsimonious variable selection. Furthermore, the BSSA variants substantially decreased the computational time required for model training compared to the baseline KRR, with Q-BSSA exhibiting the lowest runtime across datasets. Statistical validation using the Wilcoxon signed-rank test confirmed that the improvements of all BSSA variants over the baseline KRR were statistically significant. The findings highlight the efficacy of binary metaheuristic algorithms for variable selection in kernel-based models and underscore their potential in computational chemistry and related domains where high dimensionality and nonlinear interactions complicate predictive modeling.
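
The abstract does not give the transfer functions or the fitness function, so the following is only a minimal sketch of the general idea, under stated assumptions: a transfer function maps a continuous search-agent position to a 0/1 variable mask, and each mask is scored by the validation MSE of a KRR model fitted on the selected columns. The S-shaped (sigmoid) and V-shaped (|tanh|) functions are common choices in binary metaheuristics and are assumed here; the quadratic transfer function behind Q-BSSA and the sparrow position-update rules are not reproduced. All names (`binarize`, `krr_mse`, the RBF kernel, `lam`, `gamma`) and the synthetic data are illustrative and not taken from the paper.

```python
import math
import random


def s_shaped(x):
    """Sigmoid transfer function (assumed): position -> selection probability."""
    return 1.0 / (1.0 + math.exp(-x))


def v_shaped(x):
    """V-shaped transfer function (assumed): |tanh(x)|."""
    return abs(math.tanh(x))


def binarize(position, transfer, rng):
    """Turn a continuous position vector into a 0/1 variable mask."""
    return [1 if rng.random() < transfer(x) else 0 for x in position]


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((p - q) ** 2 for p, q in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def take_columns(X, cols):
    """Keep only the selected columns of a row-major data matrix."""
    return [[row[j] for j in cols] for row in X]


def krr_mse(mask, X_tr, y_tr, X_va, y_va, lam=1e-2, gamma=1.0):
    """Fitness of a mask: validation MSE of RBF-kernel ridge regression."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return float("inf")  # an empty subset is not a usable model
    A, B = take_columns(X_tr, cols), take_columns(X_va, cols)
    n = len(A)
    # Regularized kernel matrix K + lam * I
    K = [[rbf(A[i], A[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, y_tr)  # dual coefficients
    preds = [sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, A)) for x in B]
    return sum((p - t) ** 2 for p, t in zip(preds, y_va)) / len(y_va)


# Synthetic stand-in for a descriptor matrix: only column 0 drives the response.
rng = random.Random(0)


def make_data(n, n_features=4):
    X = [[rng.uniform(-2, 2) for _ in range(n_features)] for _ in range(n)]
    y = [math.sin(row[0]) for row in X]
    return X, y


X_tr, y_tr = make_data(30)
X_va, y_va = make_data(15)

# One hypothetical agent position, binarized and scored.
position = [2.0, -2.0, -2.0, -2.0]
mask = binarize(position, s_shaped, rng)
print("mask:", mask, "validation MSE:", krr_mse(mask, X_tr, y_tr, X_va, y_va))
```

In a full wrapper method the search algorithm would update the continuous positions each iteration and keep the mask with the best fitness; a penalty on the number of selected variables is often added to the fitness to favor compact subsets, though the abstract does not say whether the authors do so.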