Weather prediction in tropical Indonesia is complicated by high climate variability, persistent El Niño–Southern Oscillation (ENSO) influence, and uneven observational coverage. This study compared five machine learning algorithms (Random Forest (RF), Support Vector Machine (SVM), Long Short-Term Memory (LSTM), XGBoost, and LightGBM) using 187,320 daily records drawn from BMKG stations, ERA5 reanalysis, and TRMM satellite data (2000–2023). Preprocessing comprised MMDIF-RF imputation, Z-score normalization, and SMOTE to correct class imbalance. Models were evaluated on RMSE, MAE, R², Accuracy, Precision, Recall, F1-Score, and AUC-ROC. LSTM achieved the best performance (RMSE = 3.94 mm; R² = 0.891; F1-Score = 0.887; AUC-ROC = 0.941), reflecting its capacity to capture long-range temporal dependencies. XGBoost and LightGBM delivered competitive accuracy with training costs 8–18 times lower, while SVM recorded the lowest accuracy and the highest computational demand. Regional analysis showed that station density and data completeness mattered more than algorithm choice: LSTM RMSE ranged from 3.61 mm in West Java to 5.43 mm in East Nusa Tenggara. A tiered hybrid approach is recommended, with LightGBM or XGBoost for routine forecasting and LSTM for extreme event detection, alongside expanded BMKG coverage in eastern Indonesia.
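For readers unfamiliar with the evaluation metrics named above, the following minimal sketch shows how RMSE, MAE, R², and F1-Score are conventionally computed. It is illustrative only and is not the study's code: the sample rainfall values and the 2.5 mm rain-event threshold are hypothetical, chosen purely to make the arithmetic easy to follow.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target (here mm)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def f1_score(labels, preds):
    """Harmonic mean of precision and recall for boolean event labels."""
    tp = sum(1 for l, p in zip(labels, preds) if l and p)
    fp = sum(1 for l, p in zip(labels, preds) if not l and p)
    fn = sum(1 for l, p in zip(labels, preds) if l and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical daily rainfall values in mm (not taken from the study).
observed = [0.0, 2.0, 4.0, 10.0]
predicted = [1.0, 3.0, 3.0, 8.0]

# Hypothetical threshold that turns rainfall amounts into rain-event labels.
THRESHOLD_MM = 2.5
event_true = [v >= THRESHOLD_MM for v in observed]
event_pred = [v >= THRESHOLD_MM for v in predicted]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("R2:", r2(observed, predicted))
print("F1:", f1_score(event_true, event_pred))
```

RMSE and MAE are reported in millimetres and penalize large errors differently, which is why the two are usually quoted together; F1-Score applies only after rainfall amounts are converted to event classes.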
Copyrights © 2026