Journal : JOURNAL OF APPLIED INFORMATICS AND COMPUTING

Stroke Risk Classification Using the Ensemble Learning Method of XGBoost and Random Forest
Authors : Gullam Almuzadid; Egia Rosi Subhiyakto
Journal of Applied Informatics and Computing Vol. 9 No. 3 (2025): June 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i3.9528

Abstract

Stroke is a leading cause of global death and disability. This study proposes a stroke risk classification model using ensemble learning that combines Random Forest and XGBoost algorithms. A Kaggle dataset with 5110 samples (249 stroke, 4861 non-stroke) presented significant class imbalance. To address this, a comprehensive preprocessing pipeline was implemented, including feature encoding, feature scaling, feature selection using ANOVA F-test, outlier handling with Z-Score and IQR methods, and missing value imputation using MICE. The SMOTE-ENN approach was applied to handle class imbalance, resulting in a more balanced sample distribution. The dataset was split into 80% training and 20% testing data (hold-out test) to ensure objective evaluation. Hyperparameter optimization was performed using Bayesian optimization, while model evaluation employed stratified K-fold cross-validation to prevent overfitting. Validation on the hold-out test set demonstrated exceptional ensemble model performance with an AUC of 0.99, 98% accuracy, 98% precision, and 98% recall. Feature importance analysis identified average glucose level and age as the strongest stroke risk predictors. The proposed approach significantly improved predictive accuracy compared to previous research, demonstrating the effectiveness of ensemble learning and preprocessing methods in developing reliable, high-performing machine learning models for early stroke risk assessment.
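
To make the class counts and the 80/20 hold-out split in the abstract above concrete, the following is a minimal sketch in plain Python. It assumes the hold-out split is stratified by class, which the abstract states only for the K-fold cross-validation; the function name `stratified_holdout` and the seed are hypothetical, and this is not the authors' code.

```python
import random
from collections import Counter


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly the same share in both parts.

    Assumption: stratifying the hold-out split is an illustrative choice,
    not something the abstract confirms.
    """
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Class counts reported in the abstract: 249 stroke (1), 4861 non-stroke (0).
labels = [1] * 249 + [0] * 4861
stroke_share = 249 / 5110  # fraction of positive samples before any resampling

train_idx, test_idx = stratified_holdout(labels)
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)

print(f"stroke share: {stroke_share:.2%}")
print("train:", dict(train_counts), "test:", dict(test_counts))
```

Under these assumptions the positive class makes up just under 5% of the samples, which is the imbalance that the SMOTE-ENN resampling step is meant to address.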

Performance Comparison of Random Forest, SVM, and XGBoost Algorithms with SMOTE for Stunting Prediction
Authors : Maulana As'an Hamid; Egia Rosi Subhiyakto
Journal of Applied Informatics and Computing Vol. 9 No. 4 (2025): August 2025
Publisher : Politeknik Negeri Batam

DOI: 10.30871/jaic.v9i4.9701

Abstract

Stunting is a growth and development disorder, caused by malnutrition, recurrent infections, and lack of psychosocial stimulation, in which a child’s length or height falls below the growth standard for their age. With a prevalence of 21.5% in Indonesia in 2023, stunting is a global health problem that requires technology-based detection approaches for early intervention. This study evaluates the performance of three machine learning algorithms, Random Forest (RF), Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost), in predicting childhood stunting, and applies the SMOTE technique to handle data imbalance. The evaluation results show that XGBoost with SMOTE achieves the best performance, with 87.83% accuracy, 85.75% precision, 91.59% recall, and an 88.57% F1-score, outperforming RF (84.56%) and SVM (68.59%). These results indicate that combining XGBoost with SMOTE is the most effective of the evaluated approaches for machine learning-based stunting detection and could be used in public health programs to accelerate stunting risk identification.
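
As a quick consistency check on the reported XGBoost metrics, the F1-score can be recomputed from the stated precision and recall using the standard harmonic-mean definition. This is a small sketch; the function name `f1_from_pr` is chosen here for illustration, and only the precision and recall values come from the abstract.

```python
def f1_from_pr(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract for XGBoost with SMOTE.
f1 = f1_from_pr(0.8575, 0.9159)
print(f"{f1:.2%}")  # prints 88.57%
```

The recomputed value agrees with the F1-score given in the abstract.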