Obesity risk estimation using ensemble learning and synthetic data augmentation techniques
Ujianto, Nur Tulus; Gunawan, Gunawan; Andriani, Wresti; Ramadhani, Ivan Rizky; Nasichatun, Nasichatun
Vertex Vol. 14 No. 2 (2025): June: Computer Science
Publisher : Institute of Computer Science (IOCS)

DOI: 10.35335/1bg4ws75

Abstract

Obesity has become a major global health concern due to its strong association with chronic diseases such as diabetes, cardiovascular disorders, and certain types of cancer. Accurate, early prediction of obesity risk is essential for effective prevention and intervention strategies. However, predictive modeling in this domain often encounters two critical challenges: imbalanced datasets and the complex, nonlinear nature of behavioral and anthropometric features. This study addresses these challenges by developing a robust classification model that integrates ensemble learning with synthetic data augmentation techniques. The research uses the Obesity Dataset from Kaggle, which comprises 2,111 records labeled into seven obesity levels with a realistically imbalanced class distribution. Preprocessing steps included data cleaning, encoding, and stratified splitting. To improve class representation, two augmentation methods were applied: SMOTE for synthetic oversampling and Generative Adversarial Networks (GANs) for generating realistic minority samples. A stacking ensemble model was constructed using Random Forest and XGBoost as base learners, with Logistic Regression serving as the meta-learner. Hyperparameter optimization was conducted using both grid and randomized search. Accuracy, precision, recall, and F1-score were used to assess performance. The proposed model achieved 91% accuracy and an F1-score of 0.89, significantly outperforming models from previous studies. These findings suggest that combining ensemble learning with hybrid augmentation strategies effectively addresses class imbalance and improves predictive reliability in obesity risk estimation. The developed model holds practical value as a decision-support tool for early screening and targeted intervention in obesity prevention programs.
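
The SMOTE step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote_oversample`, its parameters, and the toy data are all hypothetical, and the sketch assumes purely numeric features and uses only the Python standard library. It shows the core idea, which is that each synthetic sample is placed at a random point on the line segment between a minority-class record and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; assumes numeric feature vectors.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features per record (not from the paper).
minority_train = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_points = smote_oversample(minority_train, n_new=6, k=2)
```

In a pipeline like the one described, oversampling of this kind is normally applied only to the training portion of the stratified split, so that no synthetic records derived from test data leak into evaluation.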