Garuda - Garba Rujukan Digital

Scientific Journal of Informatics

Vol. 12 No. 2: May 2025

Rofik, Rofik
Unjung, Jumanto

Publish Date
07 Jul 2025

Purpose: Customer churn is a crucial issue for companies, especially in the telecommunications sector, because it directly affects revenue and the cost of acquiring new customers. This research builds a customer churn prediction model by comparing the performance of the Logistic Regression and Ridge Classifier algorithms, taking the effect of data balancing into account.

Methods: The study developed a churn classification model and compared Logistic Regression and Ridge Classifier in three scenarios: without data balancing, balancing using SMOTE, and balancing using GAN. The dataset was Telco Customer Churn from Kaggle. Models were evaluated using a confusion matrix with accuracy, precision, recall, and F1-score metrics, with accuracy as the primary metric.

Result: Data balancing using SMOTE and GAN did not improve model accuracy. The highest accuracy, 82.47%, was achieved by the Ridge Classifier without data balancing, followed by Logistic Regression at 82.25%. However, recall and F1-score improved when using SMOTE. The highest recall was achieved in the SMOTE 50:50 scenario, at 75.34% for Ridge Classifier and 75.07% for Logistic Regression. The highest F1-score was also achieved by Ridge Classifier at 64.76% and Logistic Regression at 64.68%, followed by the SMOTE 50:30 scenario. Meanwhile, precision tended to decrease after data balancing.

Novelty: The novelty of this study lies in comparing the Ridge Classifier and Logistic Regression under data balancing scenarios using SMOTE and GAN, which has not been widely discussed in previous studies. The main findings show that the highest accuracy is achieved when the Ridge Classifier uses the original data, without SMOTE or GAN balancing. However, data balancing using SMOTE was found to significantly improve the recall and F1-score metrics.
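The trade-off reported in the abstract (accuracy and precision falling while recall and F1-score rise after balancing) follows from how the four metrics are computed from a confusion matrix. The sketch below is illustrative only: the `ConfusionMatrix` class and both sets of counts are hypothetical and are not taken from the paper or from the Telco Customer Churn dataset. It assumes churn is the positive class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix with churn as the positive class (hypothetical helper)."""

    tp: int  # churners correctly flagged
    fp: int  # non-churners wrongly flagged as churners
    fn: int  # churners the model missed
    tn: int  # non-churners correctly left alone

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self) -> float:
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts, NOT results from the paper: 100 churners, 300 non-churners.
# "conservative" mimics a model trained on imbalanced data (misses many churners);
# "aggressive" mimics a model trained after oversampling (flags more customers).
conservative = ConfusionMatrix(tp=50, fp=20, fn=50, tn=280)
aggressive = ConfusionMatrix(tp=75, fp=60, fn=25, tn=240)

for name, cm in [("conservative", conservative), ("aggressive", aggressive)]:
    print(
        f"{name}: accuracy={cm.accuracy():.3f} precision={cm.precision():.3f} "
        f"recall={cm.recall():.3f} f1={cm.f1():.3f}"
    )
```

With these made-up counts, the second model catches more churners at the cost of more false alarms, so its recall and F1-score are higher while its accuracy and precision are lower. This is the same direction of change the abstract describes for SMOTE, though the actual magnitudes depend on the data and models used in the study.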

Journal Info

Scientific Journal of Informatics

Abbrev

sji

Publisher

Universitas Negeri Semarang

Subject

Computer Science & IT; Control & Systems Engineering; Decision Sciences, Operations Research & Management; Electrical & Electronics Engineering; Engineering

Description

Scientific Journal of Informatics (p-ISSN 2407-7658 | e-ISSN 2460-0040) published by the Department of Computer Science, Universitas Negeri Semarang, a scientific journal of Information Systems and Information Technology which includes scholarly writings on pure research and applied research in the ...

Evaluation of Ridge Classifier and Logistic Regression for Customer Churn Prediction on Imbalanced Telecommunication Data
