Majalah Ilmiah Matematika dan Statistika (MIMS)
Vol. 24 No. 1 (2024): Majalah Ilmiah Matematika dan Statistika

Regularization of machine learning models using penalized regression on data containing multicollinearity (Case study: predicting the Human Development Index in 34 provinces of Indonesia)

Khamidah, Nur
Sadik, Kusman
M Soleh, Agus
Dito, Gerry Alfa



Article Info

Publish Date
14 Mar 2024

Abstract

This research models high-dimensional data that contain multicollinearity using four machine-learning algorithms: Random Forest, K-Nearest Neighbor (kNN), XGBoost, and Regression Tree. Before modeling, regularization is carried out with three penalized regressions: ridge regression, least absolute shrinkage and selection operator (LASSO) regression, and Elastic Net regression. A total of 100 predictor variables and 1 response variable, taken from the 2022 Human Development Index data for 34 provinces in Indonesia published by BPS, were used and standardized. A simulation is also applied to highly correlated data drawn from two distributions, uniform and normal, with parameter values taken from the empirical data. The results showed that the ridge regularization method is the best for producing accurate and stable predictions. Furthermore, there was no difference in root mean square error (RMSE) between the data with and without standardization. Across all the data analyzed, the kNN model performed better than the other models on simulated data, while the Random Forest and XGBoost models performed better than the other models on empirical data. In addition, the Regression Tree model is not recommended according to the results of this study.

Keywords: regularization, multicollinearity, ridge, LASSO, elastic net

MSC2020: 62J07
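
The abstract names ridge regularization and RMSE but does not show how either is computed. As a rough illustration only, and not the authors' procedure, the sketch below fits a no-intercept model on two nearly identical predictors, once without a penalty and once with a ridge penalty, and reports RMSE on the training data. The function names (`simulate`, `ridge_2d`, `rmse`), the two-predictor setup, the noise levels, and the penalty value `lam=1.0` are all assumptions made for this example; only the sample size of 34 echoes the number of provinces.

```python
import math
import random


def simulate(n, seed):
    """Toy data with two highly correlated predictors (hypothetical setup)."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [u + rng.gauss(0, 0.01) for u in x1]  # x2 is almost a copy of x1
    y = [u + v + rng.gauss(0, 1) for u, v in zip(x1, x2)]
    return x1, x2, y


def ridge_2d(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two predictors, no intercept.

    lam = 0 gives ordinary least squares; lam > 0 gives ridge regression.
    """
    a = sum(u * u for u in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    g1 = sum(u * t for u, t in zip(x1, y))
    g2 = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b  # close to zero when x1 and x2 are collinear
    return ((d * g1 - b * g2) / det, (a * g2 - b * g1) / det)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def predict(coef, x1, x2):
    """Fitted values for the two-predictor, no-intercept model."""
    return [coef[0] * u + coef[1] * v for u, v in zip(x1, x2)]


x1, x2, y = simulate(34, seed=0)
ols = ridge_2d(x1, x2, y, lam=0.0)    # unpenalized fit
ridge = ridge_2d(x1, x2, y, lam=1.0)  # penalized fit

print("OLS coefficients:  ", ols)
print("Ridge coefficients:", ridge)
print("OLS RMSE:  ", rmse(y, predict(ols, x1, x2)))
print("Ridge RMSE:", rmse(y, predict(ridge, x1, x2)))
```

With a positive penalty the coefficient vector can never be longer than the unpenalized one, which is why ridge tends to give more stable estimates when predictors are nearly collinear. The training RMSE of the ridge fit is slightly higher by construction; any gain in predictive accuracy would have to be checked on held-out data.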

Copyrights © 2024






Journal Info

Abbrev

MIMS

Subject

Mathematics

Description

The aim of this publication is to disseminate conceptual thoughts, ideas, and research results in the areas of mathematics and statistics. MIMS focuses on the following areas of mathematics and statistics: 1. Algebra and Geometry; 2. Analysis and ...