Accurate prediction of atmospheric CO₂ concentrations is essential for understanding climate change dynamics and developing effective environmental policies. This study evaluates the predictive capabilities of several machine learning models: the ensemble-based regressors Random Forest, Gradient Boosting, and XGBoost, alongside traditional regression models such as Support Vector Regression (SVR), Ridge, and Lasso. The dataset, derived from meteorological observations, was preprocessed with multiple feature scaling techniques (StandardScaler, MinMaxScaler, and RobustScaler), followed by feature engineering through polynomial transformation and Principal Component Analysis (PCA) to enhance predictive accuracy. Model performance was assessed using the coefficient of determination (R²) and cross-validation. The results indicate that tree-based models, including Random Forest and XGBoost, generalized poorly: they produced negative R² values, which the study attributes to overfitting and an inability to capture the temporal dependencies in CO₂ variations. SVR emerged as the best-performing model, though its predictive power remained limited. Computational complexity analysis showed that tree-based methods incurred high processing costs, while linear models such as Ridge and Lasso were less complex but failed to capture non-linear dependencies. The study highlights the challenges of CO₂ prediction with conventional machine learning techniques and underscores the need for deep learning approaches, such as hybrid Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models, to better capture spatial and temporal dependencies. Future research should explore integrating external environmental factors and leveraging such architectures to improve predictive performance.
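A negative R² may look surprising, so a minimal sketch of the metric is given here. It uses the standard definition R² = 1 − SS_res / SS_tot, under which a model scores below zero whenever its squared error exceeds that of simply predicting the mean of the observations. The function name `r_squared` and the CO₂ values are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical CO2 readings in ppm (illustrative values only)
observed = [410.0, 412.0, 414.0, 416.0]

perfect = r_squared(observed, observed)                       # 1.0
mean_baseline = r_squared(observed, [413.0] * 4)              # 0.0
wrong_trend = r_squared(observed, [416.0, 414.0, 412.0, 410.0])  # -3.0

print(perfect, mean_baseline, wrong_trend)  # prints: 1.0 0.0 -3.0
```

The third case predicts a falling trend where the observations rise, so its squared error is four times that of the mean baseline and R² drops to −3. A negative score on held-out data therefore means the model did worse than a constant prediction, which is consistent with the poor generalization reported for the tree-based models.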
Copyright © 2025