International Journal of Advances in Data and Information Systems
Vol. 6 No. 1 (2025): April 2025

Predicting Software Defects at Package Level in Java Project Using Stacking of Ensemble Learning Approach

Zahra, Nabila Athifah
Arifiyanti, Amalia Anjani
Kartika, Dhian Satria Yudha



Article Info

Publish Date
22 Apr 2025

Abstract

Compared with manual and automated testing, AI-driven testing offers a more intelligent approach: it predicts software defects earlier and improves testing efficiency. This research predicts software defects by analyzing CK (Chidamber and Kemerer) software metrics with classification algorithms. A total of 8,924 data points were collected from five open-source Java projects on GitHub. Because the classes were imbalanced, undersampling was applied during preprocessing, together with data cleaning and normalization. The final dataset consists of 1,314 instances (746 clean and 568 buggy). The predictive model is built in two stages: base learners (level-0) using the AdaBoost, Random Forest (RF), Extra Trees (ET), Gradient Boosting (GB), Histogram-based Gradient Boosting (HGB), XGBoost (XGB), and CatBoost (CAT) algorithms, and a meta-learner (level-1) that optimizes the results through ensemble stacking. The stacking model achieved an ROC-AUC score of 0.8575, outperforming every individual classifier and effectively distinguishing defective from non-defective software components. The performance gains of stacking over the base models (tree-based ensembles) were statistically validated with paired t-tests. All p-values were below 0.05, confirming that stacking’s superior performance is significant, with the largest gain observed against Gradient Boosting (+0.0411, p = 0.0030). The stacking model also produced the most favorable confusion matrix, with high True Positive and True Negative counts and relatively low False Positive and False Negative counts. These findings affirm that ensemble stacking yields a more robust and balanced classification system, enhancing defect prediction accuracy and enabling earlier issue detection in the Software Development Life Cycle (SDLC).

Copyrights © 2025






Journal Info

Abbrev

IJADIS

Publisher

Subject

Computer Science & IT Electrical & Electronics Engineering

Description

International Journal of Advances in Data and Information Systems (IJADIS) (e-ISSN: 2721-3056) is a peer-reviewed journal in the field of data science and information systems, published twice a year, in April and October. The journal is published for those who wish to share ...