Articles

Found 2 Documents
Journal : Journal of Electronics, Electromedical Engineering, and Medical Informatics

Comparative Study of Various Hyperparameter Tuning on Random Forest Classification With SMOTE and Feature Selection Using Genetic Algorithm in Software Defect Prediction
Suryadi, Mulia Kevin; Herteno, Rudy; Saputro, Setyo Wahyu; Faisal, Mohammad Reza; Nugroho, Radityo Adi
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 2 (2024): April
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i2.375

Abstract

Software defect prediction is necessary for desktop and mobile applications. Random Forest defect prediction performance can be significantly increased by parameter optimization compared to the default parameters, yet the parameter tuning step is commonly neglected. Random Forest has numerous tunable parameters; adjusting them manually diminishes its efficiency, yields suboptimal results, and takes considerable time. This research aims to improve the performance of Random Forest classification by using SMOTE to balance the data, a Genetic Algorithm for feature selection, and hyperparameter tuning to optimize performance. It also aims to determine which hyperparameter tuning method produces the best improvement in the Random Forest classification method. The dataset used in this study is NASA MDP, which comprises 13 datasets. The method combines SMOTE to handle imbalanced data, Genetic Algorithm feature selection, Random Forest classification, and hyperparameter tuning methods including Grid Search, Random Search, Optuna, Bayesian optimization (with Hyperopt), Hyperband, TPE, and Nevergrad. Performance was evaluated using accuracy and AUC values. In terms of accuracy improvement, the three best methods are Nevergrad, TPE, and Hyperband; in terms of AUC improvement, the three best are Hyperband, Optuna, and Random Search. Nevergrad improves accuracy by about 3.9% on average, and Hyperband improves AUC by about 3.51% on average. This study indicates that hyperparameter tuning improves Random Forest performance and that, among all the tuning methods used, Hyperband performs best, with the highest average increase in both accuracy and AUC.
The implication of this research is to encourage the use of hyperparameter tuning in software defect prediction and thereby improve prediction performance.
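The pipeline this abstract describes (oversample the minority class, then search the Random Forest hyperparameter space) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the paper's implementation: `simple_smote` interpolates random minority pairs rather than using the k-nearest-neighbour scheme of real SMOTE, the search shown is a plain random search (one of the seven tuners compared), and all parameter ranges and dataset sizes are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

def simple_smote(X, y, minority_label=1):
    """Minimal SMOTE-like oversampling: interpolate between random pairs of
    minority samples until the classes are balanced. (Real SMOTE picks
    partners among k nearest neighbours.)"""
    X_min = X[y == minority_label]
    n_needed = (y != minority_label).sum() - len(X_min)
    synth = []
    for _ in range(n_needed):
        a, b = X_min[rng.integers(len(X_min), size=2)]
        synth.append(a + rng.random() * (b - a))
    return (np.vstack([X, synth]),
            np.concatenate([y, np.full(n_needed, minority_label)]))

# Imbalanced toy data standing in for a NASA MDP subset
X, y = make_classification(n_samples=400, weights=[0.85], random_state=0)
X_bal, y_bal = simple_smote(X, y)

# Random search over a small Random Forest hyperparameter space,
# scored by cross-validated AUC as in the paper's evaluation
best_auc, best_params = 0.0, None
for _ in range(10):
    params = {"n_estimators": int(rng.integers(50, 300)),
              "max_depth": int(rng.integers(3, 15))}
    auc = cross_val_score(RandomForestClassifier(random_state=0, **params),
                          X_bal, y_bal, cv=5, scoring="roc_auc").mean()
    if auc > best_auc:
        best_auc, best_params = auc, params

print(best_params, round(best_auc, 3))
```

Swapping the random-search loop for Optuna, Hyperopt, or Nevergrad only changes how the next `params` dictionary is proposed; the SMOTE-then-score structure stays the same.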
Optimizing Software Defect Prediction Models: Integrating Hybrid Grey Wolf and Particle Swarm Optimization for Enhanced Feature Selection with Popular Gradient Boosting Algorithm
Angga Maulana Akbar; Herteno, Rudy; Saputro, Setyo Wahyu; Faisal, Mohammad Reza; Nugroho, Radityo Adi
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 2 (2024): April
Publisher : Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i2.388

Abstract

Software defects, also referred to as software bugs, are anomalies or flaws in a computer program that cause software to behave unexpectedly or produce incorrect results. These defects can manifest in various forms, including coding errors, design flaws, and logic mistakes, and they can emerge at any stage of the software development lifecycle. Traditional prediction models usually have lower prediction performance. To address this issue, this paper proposes a novel prediction model using a Hybrid Grey Wolf Optimizer and Particle Swarm Optimization (HGWOPSO). This research aims to determine whether the hybrid model improves the effectiveness of software defect prediction compared to the base PSO and GWO algorithms without hybridization. Furthermore, it aims to determine the effectiveness of different gradient boosting classification algorithms when combined with HGWOPSO feature selection. The study utilizes 13 NASA MDP datasets, divided into training and testing data using 10-fold cross-validation. After the data are divided, the SMOTE technique is applied to the training data; it generates synthetic samples to balance the dataset, ensuring better performance of the predictive model. Subsequently, feature selection is conducted using the HGWOPSO algorithm. Each subset of the NASA MDP dataset is then processed by three boosting classification algorithms, namely XGBoost, LightGBM, and CatBoost. Performance evaluation is based on the Area Under the ROC Curve (AUC). The average AUC values yielded by HGWOPSO XGBoost, HGWOPSO LightGBM, and HGWOPSO CatBoost are 0.891, 0.881, and 0.894, respectively. The results indicate that the HGWOPSO algorithm improves AUC compared to the base GWO and PSO algorithms; HGWOPSO CatBoost achieved the highest AUC of 0.894.
This represents a 6.5% increase in AUC with a significance value of 0.00552 compared to PSO CatBoost, and a 6.3% increase with a significance value of 0.00148 compared to GWO CatBoost. This study demonstrates that HGWOPSO significantly improves software defect prediction performance. The implication of this research is to enhance software defect prediction models by incorporating hybrid optimization techniques combined with gradient boosting algorithms, which can identify and address defects more accurately.
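The wrapper feature selection this abstract describes, where a swarm of candidate feature subsets is scored by the cross-validated AUC of a boosting classifier, can be sketched as follows. This is a deliberately simplified stand-in, not the paper's HGWOPSO: the swarm update below is a toy binary scheme (drift toward the global best plus random bit flips) rather than the hybrid Grey Wolf / PSO equations, sklearn's `GradientBoostingClassifier` stands in for XGBoost/LightGBM/CatBoost, and the data, swarm size, and probabilities are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-in for one NASA MDP subset
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

def fitness(mask):
    """Wrapper fitness: mean cross-validated AUC of a boosting classifier
    trained on the selected feature subset."""
    if mask.sum() == 0:
        return 0.0
    clf = GradientBoostingClassifier(n_estimators=40, random_state=0)
    return cross_val_score(clf, X[:, mask.astype(bool)], y,
                           cv=3, scoring="roc_auc").mean()

# Tiny binary swarm search: each particle is a 0/1 feature mask
n_particles, n_iter = 5, 3
swarm = rng.integers(0, 2, size=(n_particles, X.shape[1]))
scores = np.array([fitness(p) for p in swarm])
best = swarm[scores.argmax()].copy()
best_score = scores.max()

for _ in range(n_iter):
    for p in swarm:
        # copy some bits from the global best (exploitation) ...
        pull = (p != best) & (rng.random(len(p)) < 0.5)
        p[pull] = best[pull]
        # ... and flip a few bits at random (exploration)
        flip = rng.random(len(p)) < 0.1
        p[flip] ^= 1
        s = fitness(p)
        if s > best_score:
            best_score, best = s, p.copy()

print(int(best.sum()), round(best_score, 3))
```

In the paper's setup, the toy update step would be replaced by the HGWOPSO position equations, and `fitness` would be evaluated with XGBoost, LightGBM, or CatBoost on SMOTE-balanced training folds.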