Thomas, Binu
Unknown Affiliation

Journal : EMITTER International Journal of Engineering Technology

Performance Analysis of Decision Tree Ensemble Models and Feature Importance Analysis in Prediction of Particulate Matter PM10
Babu, Sherin; Thomas, Binu
EMITTER International Journal of Engineering Technology Vol 13 No 2 (2025)
Publisher : Politeknik Elektronika Negeri Surabaya (PENS)

DOI: 10.24003/emitter.v13i2.933

Abstract

Air pollution caused by particulate matter is known to have significant negative impacts on both the environment and human health. This research evaluates the effectiveness of decision tree ensemble models in predicting daily PM10 concentrations in Thiruvananthapuram, Kerala, from July 2017 to December 2019. Seven decision tree ensemble models are employed: Random Forest, Extra Trees, Gradient Boosting, AdaBoost, LightGBM, XGBoost, and Histogram-Based Gradient Boosting. Missing values are filled by kNN imputation to produce a complete dataset suitable for model training. The models use both meteorological and air pollutant variables as inputs, and their performance is assessed with the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). The findings indicate that the Extra Trees regression model provides the best prediction performance (R² = 0.9397, RMSE = 6.664 μg/m³, MAE = 4.950 μg/m³). Histogram-Based Gradient Boosting and Random Forest also demonstrate strong predictive capabilities. The best-performing models are explained through feature importance analysis, which highlights sulfur dioxide (SO2) as the most significant pollutant influencing PM10 levels, alongside meteorological factors such as wind speed and rainfall, enhancing both prediction accuracy and interpretability of the results. This research represents the first comprehensive effort to predict PM10 levels in Thiruvananthapuram using machine learning techniques, addressing a gap in regional air quality studies.
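
The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `regression_metrics` is hypothetical, and the observed and predicted values are made-up numbers chosen only to show the arithmetic, not data from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R², RMSE, MAE) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R² is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


# Made-up daily PM10 values in μg/m³, for illustration only (not from the paper).
observed = [40.0, 55.0, 62.0, 48.0, 70.0]
predicted = [42.0, 53.0, 60.0, 50.0, 65.0]

r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.4f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

RMSE and MAE are expressed in the units of the target (here μg/m³), while R² is unitless. RMSE penalizes large errors more heavily than MAE, which is one reason the two are usually reported together.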