Journal of Mathematics, Computation and Statistics (JMATHCOS)
Vol. 8 No. 1 (2025): Volume 08 Number 01 (April 2025)

Robust Method with Cross-Validation in Partial Least Square Regression

Sibuea, Nuraini
Syamsudhuha, Syamsudhuha
Adnan, Arisman
Silalahi, Divo Dharma



Article Info

Publish Date
24 Mar 2025

Abstract

Partial Least Squares Regression (PLSR) is a multivariate analysis technique for data in which the predictor variables are highly correlated or outnumber the samples. However, PLSR is not robust to outliers, which can degrade the stability and accuracy of the model. Cross-validation is an important approach to improving model reliability, particularly for data that contain outliers. This study evaluates the effectiveness of K-fold cross-validation and nested cross-validation in a PLSR model fitted to near-infrared spectroscopy (NIRS) data from oil palm plantation soil that contain outliers. Outliers are first identified using radial basis function (RBF) kernel PCA, after which K-fold cross-validation and nested cross-validation are applied to the PLSR model. The evaluation is based on the Root Mean Square Error (RMSE) and the Coefficient of Determination (R²). The results show that nested cross-validation performs better than K-fold cross-validation: it yields a lower RMSE and a higher R², both with and without outliers. K-fold cross-validation is more susceptible to overfitting, whereas nested cross-validation is more effective in mitigating the impact of outliers and improving model accuracy. The study concludes that nested cross-validation outperforms K-fold cross-validation in improving the prediction accuracy and stability of the PLSR model, especially for data containing outliers. It is recommended to use nested cross-

Copyright © 2025






Journal Info

Abbrev

JMATHCOS

Publisher

Subject

Mathematics

Description

The journal focuses not only on research but also on theoretical knowledge, and it does not publish plagiarized work. The scope of this journal covers mathematical theory, applied mathematics, computational programs, mathematical computation, statistics, and statistics ...