Abstract—This study presents a cross-project software defect prediction (CSDP) framework that combines feature harmonization, CORAL-based domain adaptation, SMOTE class balancing, and PCA dimensionality reduction with five classifiers: Random Forest, Logistic Regression, XGBoost, AdaBoost, and a voting ensemble (VotingClassifier). Evaluations on five AEEEM datasets (JDT, EQ, PDE, Lucene, Mylyn), in both single-source and multi-source settings, show consistent improvements over baseline methods. Although the approach does not outperform deep learning models, it remains practical and interpretable for real-world CSDP tasks.
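As an illustration of the domain adaptation step named above, the following is a minimal sketch of standard CORAL (correlation alignment), which whitens the source features with the source covariance and re-colors them with the target covariance. It is not the authors' implementation. The function names `coral_align` and `_sym_matrix_power` and the regularization parameter `reg` are hypothetical, and the sketch assumes the source and target projects have already been harmonized to the same feature columns.

```python
import numpy as np


def _sym_matrix_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power
    using its eigendecomposition (hypothetical helper)."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals ** power) @ eigvecs.T


def coral_align(source, target, reg=1.0):
    """Align source features to the target's second-order statistics.

    source, target: 2-D arrays (samples x features) with the SAME
    feature columns (assumed to be harmonized beforehand).
    reg: ridge term added to both covariances for numerical stability
    (an assumption of this sketch, not a value taken from the study).
    """
    n_features = source.shape[1]
    cov_s = np.cov(source, rowvar=False) + reg * np.eye(n_features)
    cov_t = np.cov(target, rowvar=False) + reg * np.eye(n_features)
    whiten = _sym_matrix_power(cov_s, -0.5)   # remove source correlations
    recolor = _sym_matrix_power(cov_t, 0.5)   # impose target correlations
    return source @ whiten @ recolor
```

In a pipeline of the kind described, the aligned source matrix would then be passed to the balancing, reduction, and classification stages, while the target project's features are left unchanged. CORAL does not use labels, so it can be applied without any defect annotations from the target project.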
Copyright © 2025