Comparative Study of Various Hyperparameter Tuning on Random Forest Classification With SMOTE and Feature Selection Using Genetic Algorithm in Software Defect Prediction
Suryadi, Mulia Kevin; Herteno, Rudy; Saputro, Setyo Wahyu; Faisal, Mohammad Reza; Nugroho, Radityo Adi
Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol 6 No 2 (2024): April
Publisher: Department of Electromedical Engineering, POLTEKKES KEMENKES SURABAYA

DOI: 10.35882/jeeemi.v6i2.375

Abstract

Software defect prediction is necessary for desktop and mobile applications. Parameter optimization can significantly improve Random Forest defect prediction performance over the default parameters, yet the tuning step is commonly neglected. Random Forest has numerous tunable parameters, so adjusting them manually is time-consuming and tends to yield suboptimal results. This research aims to improve the performance of Random Forest classification by using SMOTE to balance the data, a Genetic Algorithm for feature selection, and hyperparameter tuning to optimize the classifier. It also aims to determine which hyperparameter tuning method yields the greatest improvement for Random Forest classification. The study uses the NASA MDP collection, which comprises 13 datasets. The method combines SMOTE to handle imbalanced data, Genetic Algorithm feature selection, Random Forest classification, and the hyperparameter tuning methods Grid Search, Random Search, Optuna, Bayesian optimization (with Hyperopt), Hyperband, TPE, and Nevergrad. Performance was evaluated using accuracy and AUC. In terms of accuracy improvement, the three best methods are Nevergrad, TPE, and Hyperband. In terms of AUC improvement, the three best methods are Hyperband, Optuna, and Random Search. On average, Nevergrad improves accuracy by about 3.9% and Hyperband improves AUC by about 3.51%. This study indicates that hyperparameter tuning improves Random Forest performance and that, among the methods compared, Hyperband performs best overall, being the only method among the top three for both accuracy and AUC improvement.
These findings encourage wider use of hyperparameter tuning in software defect prediction as a way to improve prediction performance.
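
As an illustration of the data-balancing step only, the following is a minimal sketch of the SMOTE idea: each synthetic minority sample is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours. The abstract does not describe the study's implementation, so the function names `smote` and `balance`, the binary-label assumption, and the default `k=5` are all hypothetical choices made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Sort the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts.

    Assumes a binary problem (hypothetical simplification for this sketch).
    """
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, max(0, n_majority - len(minority)), k, seed)
    return X + new_points, y + [minority_label] * len(new_points)
```

Calling `balance(X, y)` on a list of feature vectors and binary labels returns new lists in which the minority class has been padded with synthetic samples. In a pipeline like the one the abstract describes, oversampling of this kind would normally be applied to the training split only, so that evaluation data stays free of synthetic points.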