Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control
Vol. 5, No. 3, August 2020

The Comparison of Imbalanced Data Handling Method in Software Defect Prediction

Khadijah, Khadijah
Sasongko, Priyo Sidik



Article Info

Publish Date
15 Aug 2020

Abstract

Software testing is a crucial process in the software development life cycle and directly affects software quality. However, testing is tedious and resource-consuming. It can be conducted more efficiently by focusing on the software modules that are prone to defects, so automated software defect prediction is needed. This research implemented the Extreme Learning Machine (ELM) as the classification algorithm because of its simple training process and good generalization performance.

Aside from the classification algorithm, the most important problem to address is the imbalance between samples of the positive class (prone to defects) and the negative class, which can bias the performance of the classifier. Therefore, this research compared two approaches to handling the imbalance problem: SMOTE (a resampling method) and weighted-ELM (an algorithm-level method). Experimental results using 10-fold cross-validation on the NASA MDP dataset show that including imbalance handling when building a software defect prediction model increases the specificity and g-mean of the model. When the imbalance ratio is not very small, SMOTE is better than weighted-ELM. Otherwise, weighted-ELM is better than SMOTE in terms of sensitivity and g-mean, but worse in terms of specificity and accuracy.
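
The evaluation measures and the resampling idea named in the abstract can be illustrated with a short sketch. It is not the authors' implementation. It assumes the conventional definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), g-mean = the square root of their product) and the basic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours. The function names `imbalance_metrics` and `smote_like_oversample`, the parameter `k`, and the label encoding (1 = defect-prone) are hypothetical choices made for illustration. ELM and weighted-ELM themselves are not shown.

```python
import math
import random


def imbalance_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity and g-mean for binary labels.

    Assumption: `positive` marks the defect-prone class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": math.sqrt(sensitivity * specificity),
    }


def smote_like_oversample(minority, n_new, k=5, seed=0):
    """Create `n_new` synthetic minority samples by linear interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its `k` nearest minority neighbours
    (Euclidean distance). Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to the base sample.
        others = [x for x in minority if x is not base]
        neighbours = sorted(others, key=lambda x: math.dist(base, x))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

With two defective modules among ten, a classifier that predicts "not defective" for every module reaches an accuracy of 0.8 but a sensitivity and g-mean of 0. This is why g-mean is commonly reported alongside accuracy when the classes are imbalanced.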

Copyrights © 2020






Journal Info

Abbrev

kinetik

Publisher

Subject

Computer Science & IT Control & Systems Engineering Electrical & Electronics Engineering Energy Engineering

Description

Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control is published by Universitas Muhammadiyah Malang. The journal is an open access journal in the field of Informatics and Electrical Engineering. This journal is available for researchers who want to improve ...