Sustainable development and climate change are central agendas in global policy and research. This study compares three ensemble learning models, Gradient Boosting Machine (GBM), Categorical Boosting (CatBoost), and Extreme Gradient Boosting (XGBoost), for forecasting vehicle carbon dioxide (CO₂) emissions. Data preprocessing used the Interquartile Range (IQR) method and median imputation, among other methods, to address missing values in the CO₂ rating and smog rating variables. SHapley Additive exPlanations (SHAP) and partial dependence plots (PDP) were employed for feature importance analysis and model interpretability. The results of the third experiment show that XGBoost outperformed the other models, achieving a coefficient of determination (R²) of 0.9988, a root-mean-square error (RMSE) of 2.1696, a mean absolute error (MAE) of 0.4977, and a mean absolute percentage error (MAPE) of 0.0019. The primary predictive features were combined fuel consumption (liters/100 km), city and highway fuel consumption, ethanol fuel consumption, model year, engine size, and diesel consumption. The findings suggest that boosting-based models have potential for supporting sustainable transport planning, emission-reduction policy, and evidence-based policy making.
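For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how R², RMSE, MAE, and MAPE are commonly computed. It is not the authors' code. The function name `regression_metrics` and the sample values are hypothetical, and MAPE is returned as a fraction rather than a percentage, which is an assumption about how the 0.0019 figure is expressed.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (as a fraction) for two equal-length sequences.

    Hypothetical helper for illustration only; not taken from the study.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # MAPE as a fraction; assumes no true value is zero.
        "mape": sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n,
    }


# Illustrative values only (not data from the study).
observed = [200.0, 250.0, 300.0]
predicted = [198.0, 251.0, 303.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE and MAE are in the units of the target variable, whereas R² and MAPE are unitless, which is why the four values are not directly comparable in magnitude.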
Copyrights © 2025