Vertex
Vol. 14 No. 2 (2025): June: Computer Science

Obesity risk estimation using ensemble learning and synthetic data augmentation techniques

Ujianto, Nur Tulus
Gunawan, Gunawan
Andriani, Wresti
Ramadhani, Ivan Rizky
Nasichatun, Nasichatun



Article Info

Publish Date
30 Jun 2025

Abstract

Obesity has become a major global health concern because of its strong association with chronic diseases such as diabetes, cardiovascular disorders, and certain types of cancer. Accurate, early prediction of obesity risk is essential for effective prevention and intervention strategies. However, predictive modeling in this domain often encounters two critical challenges: imbalanced datasets and the complex, nonlinear relationships among behavioral and anthropometric features. This study addresses these challenges by developing a robust classification model that integrates ensemble learning with synthetic data augmentation. The research uses the Obesity Dataset from Kaggle, which comprises 2,111 records labeled with seven obesity levels and exhibits a realistic class imbalance. Preprocessing included data cleaning, encoding, and stratified splitting. To improve the representation of minority classes, two augmentation methods were applied: SMOTE (Synthetic Minority Over-sampling Technique) for synthetic oversampling and Generative Adversarial Networks (GANs) for generating realistic minority samples. A stacking ensemble was constructed with Random Forest and XGBoost as base learners and Logistic Regression as the meta-learner. Hyperparameters were tuned using both grid search and randomized search. Performance was assessed using accuracy, precision, recall, and F1-score. The proposed model achieved 91% accuracy and an F1-score of 0.89, significantly outperforming models reported in previous studies. These findings suggest that combining ensemble learning with hybrid augmentation strategies effectively addresses class imbalance and improves predictive reliability in obesity risk estimation. The developed model has practical value as a decision-support tool for early screening and targeted intervention in obesity prevention programs.
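
As an illustration of the synthetic oversampling step mentioned in the abstract, the following is a minimal standard-library sketch of the SMOTE idea: each new minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest same-class neighbours. It is not the authors' implementation. The function name `smote_oversample`, the neighbour count, the seed, and the toy two-feature data with labels "normal" and "obese" are all assumptions made for the example. The abstract does not state the settings used in the study.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """SMOTE-style balancing (hypothetical helper, not the study's code).

    Every class is grown to the size of the largest class by interpolating
    between a sample and one of its k nearest neighbours of the same class.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the majority class
    X_out, y_out = [list(row) for row in X], list(y)

    for label, count in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if count == target or len(members) < 2:
            continue  # already balanced, or no neighbour to interpolate with
        for _ in range(target - count):
            base = rng.choice(members)
            # k nearest same-class neighbours of the chosen sample
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed for illustration): 4 majority rows, 2 minority rows.
X = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [5.0, 5.0], [5.2, 4.8]]
y = ["normal"] * 4 + ["obese"] * 2
X_bal, y_bal = smote_oversample(X, y, k=1)
print(Counter(y_bal))
```

In practice, oversampling of this kind is applied only to the training portion of a stratified split, so that synthetic points do not leak into the evaluation data. A maintained implementation such as the one in the imbalanced-learn library would normally be preferred over a hand-written version.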

Copyrights © 2025





