Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics
Vol. 7 No. 3 (2025): August

An Empirical Study of Cross-Project and Within-Project Performance in Software Defect Prediction Models Using Tree-Based and Boosting Classifiers

Raidra Zeniananto
Rudy Herteno
Radityo Adi Nugroho
Andi Farmadi
Setyo Wahyu Saputro



Article Info

Publish Date
20 Aug 2025

Abstract

Software Defect Prediction (SDP) is a vital process in modern software engineering that aims to identify faulty components early in development. In this study, we evaluated two widely used SDP approaches, Within-Project Software Defect Prediction (WP-SDP) and Cross-Project Software Defect Prediction (CP-SDP), using identical preprocessing steps to ensure an objective comparison. We used the NASA MDP dataset, splitting each project into 70% training and 30% testing data, and applied three resampling strategies (no sampling, oversampling, and undersampling) to address class imbalance. Five classification algorithms were examined: Support Vector Machine (SVM), Random Forest (RF), Gradient Boosting (GB), XGBoost (XGB), and LightGBM (LGBM). Performance was measured primarily using Accuracy and Area Under the Curve (AUC), yielding 360 experimental outcomes. WP-SDP combined with oversampling and Random Forest showed the strongest predictive capability on most projects, achieving an Accuracy of 89.92% and an AUC of 0.931 on PC4. Nonetheless, CP-SDP excelled on certain small-scale projects (e.g., MW1), indicating its potential when local historical data is scarce but inter-project characteristics remain sufficiently similar. These results underscore the importance of selecting a prediction scheme tailored to specific project attributes, class imbalance levels, and available historical data. By establishing a standardized methodological framework, our work clarifies the strengths and limitations of WP-SDP and CP-SDP, paving the way for more effective defect detection strategies and improved software quality.
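The experimental setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the CP-SDP training set is assembled or which resampling algorithm is used, so the sketch assumes that CP-SDP trains on the 70% portions of all other projects and that resampling is plain random over- or undersampling applied to the training data only. All function names (`split_70_30`, `resample`, `wp_split`, `cp_split`) are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# One module: (feature vector, label), where label 1 means "defective".
Sample = Tuple[List[float], int]


def split_70_30(samples: List[Sample], seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle one project's samples and split them 70% train / 30% test."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = (7 * len(shuffled)) // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def resample(train: List[Sample], strategy: str, seed: int = 0) -> List[Sample]:
    """Balance the training set. Assumption: random over-/undersampling."""
    if strategy == "none":
        return train[:]
    rng = random.Random(seed)
    pos = [s for s in train if s[1] == 1]
    neg = [s for s in train if s[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    if not minority:  # nothing to balance against
        return train[:]
    if strategy == "over":
        # Duplicate randomly chosen minority samples until classes are equal.
        extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
        return majority + minority + extra
    if strategy == "under":
        # Keep a random subset of the majority class of minority size.
        return rng.sample(majority, len(minority)) + minority
    raise ValueError(f"unknown strategy: {strategy}")


def wp_split(projects: Dict[str, List[Sample]], target: str, seed: int = 0):
    """WP-SDP: training and test data both come from the target project."""
    return split_70_30(projects[target], seed)


def cp_split(projects: Dict[str, List[Sample]], target: str, seed: int = 0):
    """CP-SDP (assumed): train on other projects' 70% parts, test on target's 30%."""
    _, test = split_70_30(projects[target], seed)
    train: List[Sample] = []
    for name, samples in projects.items():
        if name != target:
            other_train, _ = split_70_30(samples, seed)
            train.extend(other_train)
    return train, test


def make_project(n: int, n_defective: int) -> List[Sample]:
    """Synthetic placeholder data, not NASA MDP: the first n_defective are defective."""
    return [([float(i)], 1 if i < n_defective else 0) for i in range(n)]


projects = {"A": make_project(100, 10), "B": make_project(50, 5)}
wp_train, wp_test = wp_split(projects, "A")
cp_train, cp_test = cp_split(projects, "A")
balanced = resample(wp_train, "over")
```

Under these assumptions both schemes are scored on the same held-out 30% of the target project, so their Accuracy and AUC values are directly comparable. A classifier would be fitted on the resampled training set and evaluated on the untouched test set. Repeating this for the two schemes, three resampling strategies, and five classifiers gives 30 configurations per project.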

Copyrights © 2025






Journal Info

Abbrev

ijeeemi

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management; Electrical & Electronics Engineering; Health Professions; Materials Science & Nanotechnology

Description

Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics (IJEEEMI) publishes peer-reviewed, original research and review articles in an open-access format. Accepted articles span the full extent of the electronics, biomedical, and medical informatics fields. IJEEEMI seeks to ...