JTAM (Jurnal Teori dan Aplikasi Matematika)
Vol 10, No 3 (2026): July

Comparative Evaluation of Eigenvector Selection in Eigenvector Spatial Filtering using a Gradient Boosting Machine for PM2.5 Concentration Prediction

Putri Nisrina Az-Zahra (Statistics and Data Science, IPB University)
Anik Djuraidah (Statistics and Data Science, IPB University)
Erfiani Erfiani (Statistics and Data Science, IPB University)



Article Info

Publish Date
08 Jun 2026

Abstract

Spatial dependence remains a critical issue in spatial data analysis. To address this issue, various eigenvector selection methods within the Eigenvector Spatial Filtering (ESF) framework have been proposed. However, these methods often give no explicit information about the individual contribution of each spatial component, which limits model interpretability, particularly when the number of candidate eigenvectors is large and the model is complex. In addition, ESF has limitations in capturing the nonlinear relationships and complex interactions inherent in spatial data, while its integration with advanced feature selection methods within machine learning frameworks remains underexplored. This quantitative empirical study evaluates different eigenvector selection methods within ESF integrated with a Gradient Boosting Machine (GBM) model for predicting PM2.5 concentrations in DKI Jakarta. Data were collected from 100 monitoring stations across five administrative regions for the first half of 2025. Spatial eigenvectors were derived from a spatial weights matrix and selected using four methods: positive eigenvalues, Moran's I significance, LASSO regression, and SHAP values obtained from the GBM model. Model performance was assessed using both 10-fold random cross-validation and spatial blocked cross-validation to evaluate predictive accuracy and spatial generalization. The results showed that adding spatial eigenvectors significantly improved model performance compared to models without spatial components. Under 10-fold cross-validation, the SHAP-based selection method achieved the highest predictive accuracy (R² = 0.619), effectively capturing spatial dependence and nonlinear relationships. The SHAP method also demonstrated robustness by selecting stable and consistent spatial components across different regions.
These findings highlight the methodological advantage of integrating ESF with machine learning and SHAP-based feature selection, offering a more interpretable and robust framework for spatial modelling. Practically, the improved prediction of PM2.5 concentrations can support more accurate air quality assessments and inform environmental management strategies in urban areas.
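As a rough illustration of the first step described in the abstract, the sketch below derives candidate spatial eigenvectors from a spatial weights matrix and keeps those with positive eigenvalues. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the weights matrix was built, so the symmetric k-nearest-neighbour weights, the function names `moran_eigenvectors` and `select_positive`, the parameter `k`, and the randomly generated coordinates are all hypothetical.

```python
import numpy as np


def moran_eigenvectors(coords, k=4):
    """Eigen-decompose the doubly centred weights matrix M C M.

    Assumption: C is a symmetrised binary k-nearest-neighbour matrix.
    The source does not specify the weighting scheme.
    """
    n = coords.shape[0]
    # Pairwise Euclidean distances between stations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Indices of the k nearest neighbours (column 0 is the point itself).
    idx = np.argsort(d, axis=1)[:, 1:k + 1]
    C = np.zeros((n, n))
    C[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    C = np.maximum(C, C.T)  # make the neighbour relation symmetric
    # Centring projector M = I - 11'/n.
    M = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(M @ C @ M)
    order = np.argsort(vals)[::-1]  # largest eigenvalue first
    return vals[order], vecs[:, order]


def select_positive(vals, vecs, tol=1e-8):
    """Keep eigenvectors whose eigenvalue is positive (positive spatial autocorrelation)."""
    return vecs[:, vals > tol]


# Hypothetical coordinates standing in for 100 monitoring stations.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(100, 2))
vals, vecs = moran_eigenvectors(coords)
E = select_positive(vals, vecs)
```

The columns of `E` could then be appended to the non-spatial predictors before fitting a gradient boosting model. The other three selection rules named in the abstract (Moran's I significance, LASSO, SHAP) would each filter this same candidate set further.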

Copyrights © 2026

Journal Info

Abbrev

jtam

Publisher

Subject

Mathematics

Description

Jurnal Teori dan Aplikasi Matematika (JTAM) is managed by the Mathematics Education Study Program, Faculty of Teacher Training and Education (FKIP), Universitas Muhammadiyah Mataram, with ISSN (Print) 2597-7512 and ISSN (Online) 2614-1175. The Editorial Team accepts research results, ideas, and studies on (1) Development of methods or models ...