Journal : Scientific Journal of Informatics

Performance Analysis of Machine Learning Models using RFE Feature Selection and Bayesian Optimization in Imbalanced Data Classification with Shap-Based Explanations
Authors: Aqmar, Nurzatil; Wijayanto, Hari; Mochamad Afendi, Farit
Scientific Journal of Informatics Vol. 12 No. 3: August 2025
Publisher : Universitas Negeri Semarang

DOI: 10.15294/sji.v12i3.31459

Abstract

Purpose: This research evaluates the performance of Random Forest (RF) and Light Gradient Boosting Machine (LightGBM) models integrated with Recursive Feature Elimination (RFE) for feature selection, Bayesian Optimization (BO) for hyperparameter tuning, and three imbalanced data handling techniques: Random Undersampling (RUS), Random Oversampling (ROS), and SMOTENC. It also identifies key determinants of household food insecurity in Papua, using SHAP for transparent feature interpretation. Methods: The research used 2022 SUSENAS data from Papua Province. Data composition and variable characteristics were explored, and individual-level data were aggregated to the household level. The data were split by random sampling (80% training, 20% testing). Eighteen experimental scenarios were created by combining feature selection or no feature selection, three imbalance handling methods, and default or tuned hyperparameters. RF and LightGBM were evaluated over 50 iterations using accuracy, sensitivity, specificity, and G-Mean, and SHAP was applied to the best-performing models for interpretability. Results: LightGBM achieved the highest accuracy and stability, particularly when combined with SMOTENC and RFE+BO. RF maintained G-Mean better when paired with RUS, and the highest G-Mean (0.756) was obtained by RF + BO + RUS. Three-way ANOVA showed that model type, imbalance handling, feature selection, and their interaction significantly affected the G-Mean value. SHAP analysis shows that health, financial, and educational limitations can increase the risk of food insecurity. Novelty: This research integrates feature selection, hyperparameter tuning, and imbalanced data handling within an interpretable machine learning framework, providing a robust approach to food vulnerability classification on imbalanced datasets.
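
The abstract reports accuracy, sensitivity, specificity, and G-Mean, and uses Random Undersampling (RUS) as one of its imbalance handling techniques. The following is a minimal pure-Python sketch of those two ideas, not the authors' implementation. It assumes the usual definition of G-Mean as the square root of sensitivity times specificity, and the helper names (`confusion_counts`, `evaluate`, `random_undersample`) and the toy 90/10 labels are hypothetical, chosen only for illustration.

```python
import math
import random


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fn, tn, fp) for a binary classification problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, specificity, and G-Mean."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Assumed definition: geometric mean of the two class-wise recalls.
        "g_mean": math.sqrt(sensitivity * specificity),
    }


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class is as small as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, n_min):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data (hypothetical): 90 majority-class rows (0) and 10 minority-class rows (1).
y_true = [0] * 90 + [1] * 10
rows = list(range(100))

# A classifier that always predicts the majority class looks accurate
# but never detects the minority class, so its G-Mean collapses to zero.
scores = evaluate(y_true, [0] * 100)

# After undersampling, both classes have the same number of rows.
balanced_rows, balanced_labels = random_undersample(rows, y_true)
```

On the toy labels, the always-majority predictor has high accuracy but a G-Mean of zero, which illustrates why a class-balanced metric such as G-Mean is a more informative criterion than accuracy alone on imbalanced data.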