Multicollinearity is a violation of the assumptions of multiple linear regression analysis that occurs when the independent variables are highly correlated. The same holds for variants of the multiple linear regression model, such as the Geographically Weighted Regression (GWR) model. Multicollinearity makes parameter estimation with the Least Squares Method (LSM) unstable and inflates the variance of the estimates. What is preferred instead is an estimator with small variance, even at the cost of some bias. Thus, one way to overcome multicollinearity is to use biased estimators, such as Ridge Regression (RR), the Least Absolute Shrinkage and Selection Operator (LASSO), and the Elastic Net (EN). RR shrinks the LSM coefficients toward zero but cannot set any of them exactly to zero, so it performs no variable selection. The parameter estimates obtained from RR are biased, but the variance of the resulting regression coefficients is relatively small. In addition, an RR model becomes difficult to interpret when the number of independent variables is large. Meanwhile, LASSO is a computational method based on quadratic programming that applies the shrinkage principle of RR while also performing variable selection: it can shrink LSM coefficients exactly to zero. The LASSO method became widely used after the discovery of the Least-Angle Regression (LARS) algorithm. LASSO also has weaknesses, such as arbitrarily selecting only one variable from a group of highly correlated variables, so EN is used. In this article, the performance of the three methods is compared from a mathematical perspective. Their strengths can be summarized as follows: RR is helpful for grouping effects, shrinking the coefficients of collinear features together; LASSO suits feature selection when the dataset contains features with poor predictive power; and EN combines the penalties of LASSO and RR, which has the potential to lead to models that are both simple and predictive.
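The contrast between the three penalties can be illustrated with a minimal NumPy sketch, not the authors' implementation: a closed-form ridge estimate and a coordinate-descent solver for the elastic-net objective (which reduces to LASSO when the ridge penalty is zero). The toy design matrix, penalty weights, and column names are chosen purely for illustration; with two identical columns, RR splits the weight across both, LASSO arbitrarily keeps one and zeroes the other, and EN is sparse while still grouping the collinear pair.

```python
import numpy as np

def ridge(X, y, lam):
    """Closed-form ridge estimate: (X'X + lam*I)^(-1) X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO/EN coordinate descent."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def elastic_net(X, y, lam1, lam2, n_iter=200):
    """Coordinate descent for (1/2)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2.
    Setting lam2 = 0 gives the LASSO estimator."""
    p = X.shape[1]
    b = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with feature j's current contribution added back
            r_j = y - X @ b + X[:, j] * b[j]
            z = X[:, j] @ r_j
            b[j] = soft_threshold(z, lam1) / (X[:, j] @ X[:, j] + lam2)
    return b

# Toy design with two identical (perfectly collinear) columns x1 = x2.
X = np.array([[1., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.],
              [0., 0., 0.]])
y = np.array([4., 2., 0., 0.])

b_ridge = ridge(X, y, lam=1.0)                    # [4/3, 4/3, 1]: splits weight across x1, x2
b_lasso = elastic_net(X, y, lam1=1.0, lam2=0.0)   # [3, 0, 1]: keeps one of the pair, drops the other
b_en    = elastic_net(X, y, lam1=1.0, lam2=1.0)   # [1, 1, 0.5]: sparse and groups x1 with x2
```

In real applications the columns of `X` would be standardized first and the penalty weights chosen by cross-validation; the fixed values here only make the three behaviors easy to see.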
Copyrights © 2025