Anik Djuraidah
Statistics and Data Science, IPB University

Comparative Evaluation of Eigenvector Selection in Eigenvector Spatial Filtering using a Gradient Boosting Machine for PM2.5 Concentration Prediction
Putri Nisrina Az-Zahra; Anik Djuraidah; Erfiani Erfiani
JTAM (Jurnal Teori dan Aplikasi Matematika) Vol 10, No 3 (2026): July
Publisher : Universitas Muhammadiyah Mataram

DOI: 10.31764/jtam.v10i3.38883

Abstract

Spatial dependence remains a critical issue in spatial data analysis. To address this issue, various eigenvector selection methods within the Eigenvector Spatial Filtering (ESF) framework have been proposed. However, these methods often do not provide explicit information regarding the individual contribution of each spatial component, limiting model interpretability, particularly when dealing with a large number of candidate eigenvectors and complex models. In addition, ESF has limitations in capturing nonlinear relationships and complex interactions inherent in spatial data, while its integration with advanced feature selection methods within machine learning frameworks remains underexplored. This quantitative empirical study aims to evaluate different eigenvector selection methods within ESF integrated with a Gradient Boosting Machine (GBM) model for predicting PM2.5 concentrations in DKI Jakarta. Data were collected from 100 monitoring stations across five administrative regions for the first half of 2025. Spatial eigenvectors were derived from a spatial weights matrix and selected using four methods: positive eigenvalues, Moran’s Index significance, LASSO regression, and SHAP values obtained from the GBM model. Model performance was assessed using both 10-fold random cross-validation and spatial blocked cross-validation to evaluate predictive accuracy and spatial generalization. The results showed that adding spatial eigenvectors significantly improved the model performance compared to models without spatial components. Under 10-fold cross-validation, the SHAP-based selection method achieved the highest predictive accuracy (R² = 0.619), effectively capturing spatial dependence and nonlinear relationships. The SHAP method demonstrated robustness by selecting stable and consistent spatial components across different regions. 
These findings highlight the methodological advantage of integrating ESF with machine learning and SHAP-based feature selection, offering a more interpretable and robust framework for spatial modelling. Practically, the improved prediction of PM2.5 concentrations can support more accurate air quality assessments and inform environmental management strategies in urban areas.
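
As a rough illustration of the first selection rule named in the abstract (keeping eigenvectors with positive eigenvalues), the sketch below derives spatial eigenvectors from a doubly centered spatial weights matrix using NumPy. The abstract does not say how the weights matrix was built, so the symmetrised 4-nearest-neighbour weights, the random station coordinates, and the function name `esf_eigenvectors` are all assumptions made for illustration. This is not the authors' implementation.

```python
import numpy as np

def esf_eigenvectors(coords, k=4, tol=1e-6):
    """Hypothetical helper: spatial eigenvectors with positive eigenvalues.

    Assumes a binary k-nearest-neighbour weights matrix (not stated in the source).
    """
    n = coords.shape[0]
    # Pairwise Euclidean distances; the diagonal is set to inf so a point
    # is never counted as its own neighbour.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    # Binary kNN weights, symmetrised so the eigenproblem is symmetric.
    C = np.zeros((n, n))
    nearest = np.argsort(d, axis=1)[:, :k]
    C[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    C = np.maximum(C, C.T)
    # Centering projector M = I - 11'/n; eigenvectors of MCM are the candidates.
    M = np.eye(n) - np.ones((n, n)) / n
    eigvals, eigvecs = np.linalg.eigh(M @ C @ M)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > tol                       # positive-eigenvalue rule
    # Moran's I of each retained eigenvector is (n / sum of weights) * eigenvalue.
    morans_i = (n / C.sum()) * eigvals[keep]
    return eigvecs[:, keep], eigvals[keep], morans_i, C

# Synthetic coordinates standing in for 100 monitoring stations (illustrative only).
rng = np.random.default_rng(0)
coords = rng.uniform(size=(100, 2))
E, lam, mi, C = esf_eigenvectors(coords)
```

The columns of `E` could then be appended to the predictors of any regression model, and the remaining selection rules from the abstract (Moran's Index significance, LASSO, SHAP) would further narrow this candidate set.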