Obesity is a complex public health issue, and effective early identification is needed to mitigate its long-term health impacts. This study aimed to classify obesity levels (Underweight, Normal, Overweight, and Obese) using 14 predictors grouped into three domains: biological, dietary, and physical activity. Beyond developing an accurate predictive model, the study investigated which domain contributes most to obesity classification. Two complementary modeling strategies were applied: domain-specific decision tree models, which evaluate the predictive strength of each domain independently, and a comprehensive eXtreme Gradient Boosting (XGBoost) model trained on all predictors. To address class imbalance, SMOTENC oversampling was applied to the training set, and hyperparameters for both approaches were tuned via cross-validation. On the test set, the XGBoost model outperformed the domain-based decision trees on every reported metric: balanced accuracy, precision, recall, specificity, and F1-score. The decision trees offered domain-level interpretability but lacked the predictive power of the integrated model. SHAP (SHapley Additive Explanations) analysis showed that influential features spanned all three domains, with Age, Vegetable consumption, and Transportation type emerging as top predictors. These findings indicate that integrating multi-domain behavioral data enhances both the accuracy and the interpretability of obesity classification models, supporting the use of interpretable machine learning for personalized health risk assessment and prevention strategies.
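As an illustration of the evaluation metrics named above, the following minimal sketch computes one-vs-rest precision, recall, specificity, and F1-score for the four obesity classes, plus balanced accuracy. It is not the study's code. The function names, the toy labels, and the definition of balanced accuracy as the mean of per-class recall are assumptions made for this example; the study may have used a different definition or library.

```python
from collections import Counter

# Class names taken from the abstract; everything else here is hypothetical.
LABELS = ["Underweight", "Normal", "Overweight", "Obese"]


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return one-vs-rest precision, recall, specificity and F1 per class."""
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    total = len(y_true)
    metrics = {}
    for c in labels:
        tp = pairs[(c, c)]
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        fp = sum(n for (t, p), n in pairs.items() if t != c and p == c)
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = {
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
            "support": tp + fn,  # number of true samples of this class
        }
    return metrics


def balanced_accuracy(metrics):
    """Mean recall over classes that occur in the true labels (assumed definition)."""
    recalls = [m["recall"] for m in metrics.values() if m["support"] > 0]
    return sum(recalls) / len(recalls)


# Toy labels invented for illustration only; they are not study data.
y_true = ["Underweight", "Normal", "Normal", "Overweight",
          "Overweight", "Obese", "Obese", "Obese"]
y_pred = ["Underweight", "Normal", "Overweight", "Overweight",
          "Overweight", "Obese", "Obese", "Overweight"]

metrics = per_class_metrics(y_true, y_pred)
for name, m in metrics.items():
    print(name, {k: round(v, 3) for k, v in m.items()})
print("balanced accuracy:", round(balanced_accuracy(metrics), 3))
```

Because balanced accuracy averages recall per class rather than per sample, a model that ignores a minority class is penalized even when overall accuracy looks high, which is why this metric suits imbalanced class distributions.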
Copyrights © 2026