Software Defect Prediction (SDP) is a vital process in modern software engineering, aimed at identifying faulty components in the early stages of development. In this study, we conducted a comprehensive evaluation of two widely used SDP approaches, Within-Project Software Defect Prediction (WP-SDP) and Cross-Project Software Defect Prediction (CP-SDP), applying identical preprocessing steps to ensure an objective comparison. We used the NASA MDP dataset, splitting each project into 70% training and 30% testing data, and applied three resampling strategies (no sampling, oversampling, and undersampling) to address class imbalance. Five classification algorithms were examined: Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting (GB), XGBoost (XGB), and LightGBM (LGBM). Performance was measured primarily with Accuracy and Area Under the Curve (AUC), yielding 360 experimental outcomes. Our findings show that WP-SDP combined with oversampling and Random Forest delivered superior predictive performance on most projects, reaching an Accuracy of 89.92% and an AUC of 0.931 on PC4. Nonetheless, CP-SDP excelled on certain small-scale projects (e.g., MW1), underscoring its potential when local historical data is scarce but inter-project characteristics remain sufficiently similar. These results highlight the importance of selecting a prediction scheme tailored to specific project attributes, class imbalance levels, and the amount of available historical data. By establishing a standardized methodological framework, our work contributes to a clearer understanding of the strengths and limitations of WP-SDP and CP-SDP, paving the way for more effective defect detection strategies and improved software quality.
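For illustration, the following is a minimal sketch of one WP-SDP configuration summarized above (70/30 split, oversampling, Random Forest, evaluated with Accuracy and AUC). The library choices (scikit-learn, imbalanced-learn), the use of SMOTE as the oversampler, and the file/column names are illustrative assumptions, not the exact experimental setup of the study.

```python
# Sketch of one WP-SDP configuration: 70/30 split, oversampling of the
# minority (defective) class, Random Forest, Accuracy and AUC evaluation.
# SMOTE and the CSV layout below are assumptions made for this example.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE

# Hypothetical NASA MDP project file: numeric metrics plus a binary label column.
data = pd.read_csv("PC4.csv")
X, y = data.drop(columns=["defective"]), data["defective"]

# 70% training / 30% testing split, stratified to preserve the defect ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Oversample only the training data to avoid leaking test-set information.
X_train_res, y_train_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train_res, y_train_res)

acc = accuracy_score(y_test, model.predict(X_test))
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"Accuracy: {acc:.4f}  AUC: {auc:.4f}")
```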