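Robust K-Fold Cross Validation Method with Partial Least Square Regression on Near Infrared Spectroscopy Data (original title: Metode Robust K-Fold Cross Validation dengan Partial Least Square Regression pada Data Near Infrared Spectroscopy)
Sibuea, Nuraini; Syamsudhuha, Syamsudhuha; Adnan, Arisman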
Seminar Nasional Teknologi Informasi Komunikasi dan Industri 2024: SNTIKI 16
Publisher : UIN Sultan Syarif Kasim Riau


Abstract

This study evaluates the performance of the Partial Least Square Regression (PLSR) model on data with and without outliers. To handle data containing outliers, k-fold cross validation is applied to Near Infrared Spectroscopy (NIRS) data from oil palm plantation soil with respect to nitrogen (N) fertilizer. Before processing, the data are pretreated with Standardized Normal Variate (SNV) to remove scattering effects. Outlier identification with RBF Kernel PCA flags observations 7, 8, 92, 93, and 95 as outliers. The analysis shows that the presence of outliers significantly degrades the performance of classical PLSR, lowering R² and raising RMSE. Applying k-fold cross validation to PLSR improves the model's robustness to outliers, raising R² despite a slight increase in RMSE. The study concludes that k-fold cross validation handles data sets containing outliers more effectively and gives more stable predictive ability than classical PLSR.
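The pretreatment, fold splitting, and evaluation metrics named in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the folds are contiguous and unshuffled, the SNV step uses the sample standard deviation (an assumed convention), and the PLSR fit itself is not shown.

```python
import math

def snv(spectrum):
    """Centre and scale one spectrum by its own mean and standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 denominator); an assumed convention.
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds
```

In a full pipeline, each spectrum would pass through `snv` first, a regression model would be fitted on every training fold, and `rmse` and `r2` would be averaged over the held-out folds.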
Robust Method with Cross-Validation in Partial Least Square Regression
Sibuea, Nuraini; Syamsudhuha, Syamsudhuha; Adnan, Arisman; Silalahi, Divo Dharma
Journal of Mathematics, Computations and Statistics Vol. 8 No. 1 (April 2025)
Publisher : Jurusan Matematika FMIPA UNM

DOI: 10.35580/jmathcos.v8i1.4766

Abstract

Partial Least Squares Regression (PLSR) is a multivariate analysis technique used to handle data with highly correlated predictor variables or when the number of predictor variables exceeds the number of samples. PLSR is not robust to outliers, which can disrupt the stability and accuracy of the model. Cross-validation is an important approach to improve model reliability, particularly in data that contains outliers. This study aims to evaluate the effectiveness of K-fold cross-validation and nested cross-validation in a PLSR model using NIRS data from oil palm plantation soil that contains outliers. The methods used in this study include outlier identification using RBF kernel PCA, followed by the application of K-fold cross-validation and nested cross-validation in the PLSR model. The evaluation is based on the Root Mean Square Error (RMSE) and the Coefficient of Determination (R²). The results show that nested cross-validation performs better than K-fold cross-validation. Nested cross-validation results in lower RMSE and higher R², both with and without outliers. K-fold cross-validation is more susceptible to overfitting, whereas nested cross-validation is more effective in mitigating the impact of outliers and improving model accuracy. The conclusion of this study is that nested cross-validation outperforms K-fold cross-validation in improving prediction accuracy and the stability of the PLSR model, especially in data containing outliers. It is recommended to use nested cross-
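The structural difference between plain K-fold and nested cross-validation in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: a one-predictor ridge regression stands in for PLSR to keep the code short, the tuned setting is a hypothetical penalty `lam` (the abstract does not say what the study tuned), and the folds are contiguous and unshuffled.

```python
import math

def kfold(indices, k):
    """Yield (train, test) lists from contiguous, near-equal folds of `indices`."""
    n = len(indices)
    bounds = [i * n // k for i in range(k + 1)]
    for i in range(k):
        test = indices[bounds[i]:bounds[i + 1]]
        train = indices[:bounds[i]] + indices[bounds[i + 1]:]
        yield train, test

def fit_ridge(x, y, lam):
    """Fit y ~ a + b*x with an L2 penalty `lam` on the slope (stand-in model)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / (sxx + lam)
    return my - b * mx, b

def rmse(model, x, y):
    """Root mean square error of the fitted line on (x, y)."""
    a, b = model
    return math.sqrt(sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(y))

def nested_cv(x, y, grid, outer_k=4, inner_k=3):
    """Mean outer-fold RMSE, with the penalty chosen inside each outer training set."""
    outer_scores = []
    for train, test in kfold(list(range(len(x))), outer_k):
        # Inner loop: score every candidate using the outer-training data only.
        def inner_score(lam):
            scores = []
            for tr, va in kfold(train, inner_k):
                model = fit_ridge([x[i] for i in tr], [y[i] for i in tr], lam)
                scores.append(rmse(model, [x[i] for i in va], [y[i] for i in va]))
            return sum(scores) / len(scores)
        best = min(grid, key=inner_score)
        # Refit on the full outer-training set, then score on the untouched test fold.
        model = fit_ridge([x[i] for i in train], [y[i] for i in train], best)
        outer_scores.append(rmse(model, [x[i] for i in test], [y[i] for i in test]))
    return sum(outer_scores) / len(outer_scores)
```

The point of the nesting is that the outer test fold never takes part in choosing the setting, so the reported error is not flattered by the tuning step. Plain K-fold that tunes and reports on the same folds lacks that separation.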
Application of The RFM Model and K-Means Clustering for Customer Segmentation in E-Wallet Top-Up Services
Sundari, Agus; Putra, Indra Syah; Sibuea, Nuraini
INFOMATEK Vol 28 No 1 (2026): June 2026 (In Progress)
Publisher : Fakultas Teknik, Universitas Pasundan

DOI: 10.23969/infomatek.v28i1.42246

Abstract

The implementation of digital payment technology through e-wallet top-up services requires financial institutions to understand user characteristics and behavior comprehensively. The objective of this study is to segment customers based on their e-wallet top-up behavior by analyzing 143,836 bill payment transaction records using the RFM (Recency, Frequency, Monetary) model combined with the K-Means clustering algorithm. The RFM parameters represent the time since the last transaction, the frequency of top-ups, and the monetary value spent by users. The RFM scoring process is applied to quantify user activity levels before the clustering stage. The K-Means clustering model grouped customers into three distinct segments. The first segment represents low-activity users, the second consists of moderately active customers with stable transaction behavior, and the third captures highly engaged users with the highest transaction frequency and value. Evaluation metrics, including a silhouette score of 0.64, a Calinski-Harabasz index of 21690.50, and a Davies-Bouldin score of 0.70, demonstrate strong clustering performance and reliable separation between groups. The findings provide valuable insights for designing service strategies, improving mobile banking system performance, and developing targeted marketing approaches tailored to each customer segment. This research highlights the potential of RFM-based clustering as a decision-support tool for optimizing digital payment services and enhancing customer engagement.
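The RFM-then-cluster workflow described in this abstract can be sketched as follows. This is a minimal illustration, not the study's pipeline: the record layout `(customer, day, amount)`, the function names, the min-max scaling step, and the toy data are all assumptions, and the K-Means here is plain Lloyd's algorithm seeded with the first k points rather than any particular library routine.

```python
from collections import defaultdict

def rfm_table(transactions, today):
    """Return {customer: (recency_days, frequency, monetary)} from (customer, day, amount) rows."""
    last_day, count, total = {}, defaultdict(int), defaultdict(float)
    for customer, day, amount in transactions:
        last_day[customer] = max(day, last_day.get(customer, day))
        count[customer] += 1
        total[customer] += amount
    return {c: (today - last_day[c], count[c], total[c]) for c in last_day}

def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single RFM dimension dominates the distance."""
    cols = list(zip(*rows))
    lo, hi = [min(c) for c in cols], [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(r, lo, hi))
            for r in rows]

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with the first k points as seeds (deterministic, not k-means++)."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

# Hypothetical toy data: two dormant customers and two frequent, recent ones.
transactions = [
    ("a", 1, 10.0), ("b", 2, 12.0),
    ("c", 28, 50.0), ("c", 29, 60.0), ("c", 30, 55.0),
    ("d", 27, 52.0), ("d", 29, 58.0), ("d", 30, 61.0),
]
table = rfm_table(transactions, today=30)
customers = sorted(table)
labels = kmeans(min_max_scale([table[c] for c in customers]), k=2)
```

The toy data has two obvious groups, so `k=2` is used here; the study itself reports three segments. Scaling before clustering matters because monetary totals are typically orders of magnitude larger than frequency counts.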