Jurnal Teknik Informatika (JUTIF)
Vol. 6 No. 6 (2025): JUTIF Volume 6, Number 6, December 2025

Comparative Performance Evaluation of Linear, Bagging, and Boosting Models Using BorutaSHAP for Software Defect Prediction on NASA MDP Datasets

Kartika, Najla Putri
Herteno, Rudy
Budiman, Irwan
Nugrahadi, Dodon Turianto
Abadi, Friska
Ahmad, Umar Ali
Faisal, Mohammad Reza



Article Info

Publish Date
05 Jan 2026

Abstract

Software defect prediction aims to identify potentially defective modules early, improving software reliability and reducing maintenance costs. However, high feature dimensionality, irrelevant metrics, and class imbalance often degrade the performance of prediction models. This study compares three groups of classification models (linear, bagging, and boosting), each combined with the BorutaSHAP feature selection method, with the goal of improving prediction stability and interpretability. Twelve datasets from the NASA Metrics Data Program (MDP) served as benchmarks. The research stages comprised data preprocessing, class balancing with the Synthetic Minority Oversampling Technique (SMOTE), feature selection with BorutaSHAP, and model training with five algorithms: Logistic Regression, Linear SVC, Random Forest, Extra Trees, and XGBoost. Models were evaluated with stratified 5-fold cross-validation using the F1-score and the Area Under the Curve (AUC). Tree-based ensemble models gave the most consistent performance: Extra Trees recorded the highest average AUC (0.794 ± 0.05), followed by Random Forest (0.783 ± 0.06). XGBoost achieved the best result on the PC4 dataset (AUC = 0.937 ± 0.008), indicating its ability to handle complex data patterns. These findings indicate that BorutaSHAP is effective at selecting relevant features, improving classification reliability, and strengthening transparency and interpretability within the Explainable Artificial Intelligence (XAI) framework for software quality improvement.
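
The evaluation protocol named in the abstract (stratified 5-fold cross-validation scored by F1 and AUC, reported as mean ± standard deviation) can be illustrated with a minimal plain-Python sketch. This is not the authors' code: all function names are hypothetical, the sample standard deviation is an assumption (the abstract does not say which form was used), and a real experiment would normally rely on library implementations rather than these hand-written helpers.

```python
import random
import statistics


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose folds keep the class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal each class round-robin so every fold gets its share.
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(values):
    """Return (mean, sample standard deviation) across folds."""
    return statistics.mean(values), statistics.stdev(values)


def evaluate_scores(labels, scores, k=5, threshold=0.5):
    """Score pre-computed model outputs fold by fold (no training here)."""
    f1s, aucs = [], []
    for _train, test in stratified_kfold(labels, k):
        y = [labels[i] for i in test]
        s = [scores[i] for i in test]
        f1s.append(f1_score(y, [1 if v >= threshold else 0 for v in s]))
        aucs.append(roc_auc(y, s))
    return summarize(f1s), summarize(aucs)
```

In an actual experiment the scores for each test fold would come from a model fitted on the corresponding training indices, which `stratified_kfold` also returns; `evaluate_scores` skips that step and only shows how per-fold F1 and AUC values are aggregated into a mean ± standard deviation figure.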

Copyrights © 2025






Journal Info

Abbrev

jurnal

Subject

Computer Science & IT

Description

Jurnal Teknik Informatika (JUTIF) is an Indonesian national journal that publishes high-quality research papers in the broad field of Informatics, Information Systems, and Computer Science, which encompasses software engineering, information system development, computer systems, computer network, ...