Syntax Literate: Jurnal Ilmiah Indonesia

Applying SMOTE-NC on CART Algorithm to Handle Imbalanced Data in Customer Churn Prediction: A Case Study of Telecommunications Industry

Ilma Amira Rahmayanti (Statistics Study Program, Faculty of Science and Technology, University of Airlangga, Surabaya, Indonesia)
Sediono Sediono (Statistics Study Program, Faculty of Science and Technology, University of Airlangga, Surabaya, Indonesia)
Toha Saifudin (Statistics Study Program, Faculty of Science and Technology, University of Airlangga, Surabaya, Indonesia)
Elly Ana (Statistics Study Program, Faculty of Science and Technology, University of Airlangga, Surabaya, Indonesia)



Article Info

Publish Date
22 Dec 2021

Abstract

Telecommunications is now needed in nearly every area of life, which has made competition among telecom companies intense. One strategic way for a company to protect itself is to retain its existing customers. A retention program must be targeted precisely and run efficiently so that the company keeps as many customers as possible, and customer churn prediction plays an essential role in that targeting. However, class imbalance in the data can increase prediction errors. To overcome this issue, this study combined the Synthetic Minority Oversampling Technique-Nominal Continuous (SMOTE-NC) with Classification and Regression Trees (CART). SMOTE-NC was applied to balance the classes in the training data, and CART then built a classification tree from the balanced data. This classification tree became the basis for predicting customer churn. The data used in this study come from https://community.ibm.com/ and include variables on customer demographics, customer contracts, usage history, and customer status for one telecom company. On these data, the combination of SMOTE-NC and CART reduced errors in predicting customer churn and increased recall by approximately 19%. The accuracy of the combined method remained above 75%. This study therefore proposes a way to improve the performance of churn prediction, especially in the telecommunications industry.
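
The oversampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that churners are the minority class, follows the commonly described SMOTE-NC procedure (a nominal mismatch adds the median of the continuous features' standard deviations to the distance, continuous values are interpolated, and nominal values are set by majority vote among the neighbours), and uses hypothetical rows and column meanings (tenure in months, monthly charge, contract type). A CART classifier would then be trained on the original rows plus the synthetic ones.

```python
import math
import random
import statistics
from collections import Counter


def smote_nc(minority, cont_idx, nom_idx, n_new, k=3, seed=0):
    """Return n_new synthetic minority rows for mixed nominal/continuous data."""
    rng = random.Random(seed)

    # Penalty for a nominal mismatch: median of the standard deviations
    # of the continuous features, computed on the minority class only.
    med = statistics.median(
        [statistics.pstdev([row[i] for row in minority]) for i in cont_idx]
    )

    def distance(a, b):
        # Squared Euclidean distance on the continuous features ...
        d2 = sum((a[i] - b[i]) ** 2 for i in cont_idx)
        # ... plus med^2 for every nominal feature whose values differ.
        d2 += sum(med ** 2 for i in nom_idx if a[i] != b[i])
        return math.sqrt(d2)

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen row (excluding itself).
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: distance(base, row),
        )[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)

        new_row = list(base)
        for i in cont_idx:
            # Continuous features: a point on the segment base -> partner.
            new_row[i] = base[i] + gap * (partner[i] - base[i])
        for i in nom_idx:
            # Nominal features: most frequent value among the k neighbours.
            votes = Counter(row[i] for row in neighbours)
            new_row[i] = votes.most_common(1)[0][0]
        synthetic.append(new_row)
    return synthetic


# Hypothetical churner rows: [tenure_months, monthly_charge, contract_type]
churners = [
    [2, 70.0, "month-to-month"],
    [5, 85.5, "month-to-month"],
    [3, 95.5, "month-to-month"],
    [9, 80.0, "one-year"],
    [4, 90.0, "month-to-month"],
]

synthetic = smote_nc(churners, cont_idx=[0, 1], nom_idx=[2], n_new=4)
for row in synthetic:
    print(row)
```

Only the training split would be oversampled in this way; the test split keeps its original class proportions so that recall and accuracy reflect the real distribution.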

Copyrights © 2021






Journal Info

Abbrev

syntax-literate

Publisher

Subject

Humanities; Education; Environmental Science; Law, Crime, Criminology & Criminal Justice; Social Sciences; Other

Description

Syntax Literate: Jurnal Ilmiah Indonesia is a peer-reviewed scientific journal that publishes original research and critical studies in various fields of science, including education, social sciences, humanities, economics, and engineering. The journal aims to provide a platform for researchers, ...