Journal of Applied Data Sciences
Vol 7, No 1: January 2026

Feature Engineering for Tropical Rainfall Forecasting Using Random Forest and Support Vector Regression

Slamet, Cepy
Imron, Rizka M
Wahana, Agung
Maylawati, Dian Sa'adillah
Zulfikar, Wildan Budiawan
Ramdhani, Muhammad Ali



Article Info

Publish Date
31 Jan 2026

Abstract

Rainfall in Indonesia is shaped by multiple interacting climatic drivers, which makes forecasting in tropical regions a significant scientific challenge. This study proposes an automated feature engineering pipeline to improve the performance of Random Forest Regression (RFR) and Support Vector Regression (SVR) models for tropical rainfall prediction. Monthly rainfall data spanning 388 months (1993–2025) from a BMKG station were used for model development. The pipeline systematically generates temporal, seasonal, statistical, and anomaly-based features that give non-sequential learning algorithms a domain-informed representation of the series. Model performance was evaluated under four temporal data partitioning scenarios using R², RMSE, and confidence intervals derived from bootstrap residual simulations. RFR achieved the highest predictive accuracy (R² = 0.93; RMSE = 31.01 mm) and the greater temporal–seasonal stability (rolling CV: R² = 0.81 ± 0.07; RMSE = 55.44 ± 16.18 mm), with comparable performance in the wet and dry seasons. SVR, by contrast, was more sensitive to seasonal variability: its R² dropped to 0.55 during the wet season, indicating higher uncertainty under extreme rainfall conditions. Robustness and drift analyses further showed that RFR adapts better to temporal and seasonal shifts, while SVR remains relevant as an adaptive model for extreme risk analysis. Overall, the study contributes an automated feature engineering approach, a reproducible climatological forecasting pipeline, and a probabilistic modeling framework for rainfall prediction under uncertainty in tropical regions.
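
The abstract names three building blocks: engineered features for a non-sequential regressor, rolling temporal validation, and bootstrap residual intervals. The sketch below illustrates one plausible reading of each using only the Python standard library. It is not the authors' pipeline: the function names, the lag set (1, 2, 12), the 3-month window, the fold sizes, and the clipping of intervals at zero are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import math
import random
from statistics import mean, pstdev


def build_features(rain, start_month=1, lags=(1, 2, 12), window=3):
    """Turn a monthly rainfall series (mm) into feature rows and targets.

    Hypothetical helper: every feature for month t uses only values
    observed before t, so the target does not leak into its own row.
    """
    first = max(max(lags), window)
    months = [(start_month - 1 + i) % 12 + 1 for i in range(len(rain))]
    rows, targets = [], []
    for t in range(first, len(rain)):
        past = rain[t - window:t]
        prev_month = months[t - 1]
        # Climatology of the previous calendar month, from data before t.
        clim = mean(rain[i] for i in range(t) if months[i] == prev_month)
        angle = 2 * math.pi * months[t] / 12
        row = {f"lag_{k}": rain[t - k] for k in lags}        # temporal
        row.update(
            roll_mean=mean(past), roll_std=pstdev(past),     # statistical
            month_sin=math.sin(angle),                       # seasonal
            month_cos=math.cos(angle),
            anom_prev=rain[t - 1] - clim,                    # anomaly
        )
        rows.append(row)
        targets.append(rain[t])
    return rows, targets


def rolling_origin_splits(n, n_folds=4, test_size=12):
    """Yield (train_idx, test_idx) pairs where training precedes testing."""
    for k in range(n_folds, 0, -1):
        test_start = n - k * test_size
        if test_start <= 0:
            continue  # not enough history for this fold
        yield (list(range(test_start)),
               list(range(test_start, test_start + test_size)))


def bootstrap_interval(prediction, residuals, n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval from resampled residuals added to a prediction."""
    rng = random.Random(seed)
    sims = sorted(prediction + rng.choice(residuals) for _ in range(n_boot))
    lo = sims[int((alpha / 2) * n_boot)]
    hi = sims[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    # Assumption: rainfall cannot be negative, so clip both bounds at zero.
    return max(lo, 0.0), max(hi, 0.0)


if __name__ == "__main__":
    # Synthetic seasonal series for demonstration only, not the BMKG data.
    demo = [150 + 100 * math.sin(2 * math.pi * i / 12) for i in range(60)]
    X, y = build_features(demo)
    for train_idx, test_idx in rolling_origin_splits(len(X)):
        print(len(train_idx), "training rows ->", len(test_idx), "test rows")
```

In practice the feature rows would be passed to a regressor such as a random forest or an SVR, and the residuals from held-out folds would supply the `residuals` argument. Because each training window ends before its test window begins, the evaluation respects the time order of the series.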

Copyrights © 2026






Journal Info

Abbrev

JADS

Publisher

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management

Description

One of the current hot topics in science is data: how can datasets be used in scientific and scholarly research in a more reliable, citable and accountable way? Data is of paramount importance to scientific progress, yet most research data remains private. Enhancing the transparency of the processes ...