Journal of Applied Data Sciences
Vol 6, No 4: December 2025

Optimizing The XGBoost Model with Grid Search Hyperparameter Tuning for Maximum Temperature Forecasting

Sugiarto, Sugiarto
Mas Diyasa, I Gede Susrama
Alhamda, Denisa Septalian
Aryananda, Rangga Laksana
Fatmah Sari, Allan Ruhui
Sukri, Hanifudin
Dewi, Deshinta Arrowa



Article Info

Publish Date
13 Sep 2025

Abstract

This study presents a novel comparative approach to maximum temperature forecasting in Surabaya, Indonesia, by integrating Extreme Gradient Boosting (XGBoost) with Grid Search Hyperparameter Tuning and benchmarking it against Autoregressive Integrated Moving Average (ARIMA) and Neural Prophet models. The main idea is to evaluate the capability of XGBoost in capturing nonlinear patterns in environmental time series data, which traditional models often fail to address. Using 15,388 historical daily maximum temperature records from the BMKG Juanda weather station spanning 1981–2022, the objective is to identify the most accurate predictive model for short- and medium-term forecasts. The modeling process involved four stages: data acquisition, preprocessing, training, and evaluation, with performance assessed using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The findings show that, after hyperparameter tuning, XGBoost achieved the best performance with MAE = 0.32 and RMSE = 0.65, outperforming ARIMA (MAE = 0.85, RMSE = 1.20) and Neural Prophet (MAE = 0.70, RMSE = 0.98). Prediction results for 2025 indicate peak maximum temperatures in January, October, and November, aligning with recent climate patterns. The contribution of this research lies in demonstrating the superiority of a tuned XGBoost model for complex environmental datasets, offering a practical tool for urban climate planning, agricultural scheduling, and heatwave risk mitigation. The novelty of this work is the systematic integration of Grid Search-based optimization with XGBoost for meteorological forecasting in a tropical urban context, producing higher accuracy than both classical statistical and modern hybrid time series methods. These results highlight the model’s adaptability and potential for broader climate-related applications, with future research recommended to incorporate additional meteorological variables such as humidity and wind speed for even greater predictive capability.
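The evaluation and tuning loop summarized in the abstract can be sketched roughly as follows. The abstract does not state the hyperparameter grid, the input features, or the train/validation split, so everything below is an assumption made for illustration: the stand-in forecaster `lag_forecast` (a decayed average of recent days, not XGBoost), the grid values, the split indices, and the synthetic temperature series (not BMKG Juanda data) are all hypothetical. Only the MAE and RMSE definitions and the general idea of exhaustively scoring every grid combination on held-out, chronologically later data carry over.

```python
import itertools
import math


def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def lag_forecast(history, n_lags, decay):
    """Hypothetical stand-in model (NOT XGBoost): decayed average of the last n_lags values."""
    recent = history[-n_lags:][::-1]  # most recent observation first
    weights = [decay ** k for k in range(len(recent))]
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)


def walk_forward(series, start, end, n_lags, decay):
    """One-step-ahead forecasts for series[start:end], each using only earlier data."""
    return [lag_forecast(series[:t], n_lags, decay) for t in range(start, end)]


def grid_search(series, start, end, grid):
    """Score every combination in the grid by RMSE on series[start:end]; keep the best."""
    best_score, best_params = None, None
    for n_lags, decay in itertools.product(grid["n_lags"], grid["decay"]):
        preds = walk_forward(series, start, end, n_lags, decay)
        score = rmse(series[start:end], preds)
        if best_score is None or score < best_score:
            best_score, best_params = score, {"n_lags": n_lags, "decay": decay}
    return best_score, best_params


# Synthetic daily maximum temperatures (assumed values, for illustration only):
# an annual cycle plus a faster deterministic wiggle.
series = [32.0 + 2.0 * math.sin(2 * math.pi * t / 365) + 0.5 * math.sin(t)
          for t in range(400)]

# Hypothetical grid and chronological split: validate on days 300-349, test on days 350-399.
param_grid = {"n_lags": [1, 3, 7], "decay": [0.5, 0.8, 1.0]}
best_score, best_params = grid_search(series, 300, 350, param_grid)

# Report both metrics on the later, untouched test segment using the selected settings.
test_preds = walk_forward(series, 350, 400, **best_params)
test_mae = mae(series[350:400], test_preds)
test_rmse = rmse(series[350:400], test_preds)
print(best_params, round(test_mae, 3), round(test_rmse, 3))
```

To move this toward the study's setting, the stand-in forecaster would be replaced by a gradient-boosted regressor fitted on lagged features, with the grid covering that model's own hyperparameters. Keeping the validation window strictly after the training data matters for time series, since a shuffled split would let the model see the future.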

Copyrights © 2025






Journal Info

Abbrev

JADS

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...