Customer churn is a critical issue for telecommunications companies, as it directly impacts revenue and business sustainability. This study develops a churn prediction model using the Extreme Gradient Boosting (XGBoost) algorithm combined with Boruta feature selection and SHAP (SHapley Additive exPlanations)-based feature interpretation. The dataset is the Telco Customer Churn dataset from Kaggle, consisting of 7,043 customer records and 21 features. The research stages include data preprocessing, data transformation, an 80:20 train-test split, data balancing using SMOTE, feature selection with Boruta, feature interpretation with SHAP, and classification using XGBoost. Model performance was evaluated using accuracy, precision, recall, and F1-score. Results show that the XGBoost model with Boruta-SHAP (Model B) achieved an accuracy of 0.7576, slightly higher than the model without feature selection (Model A), which achieved 0.7512. Model B also performed better on the majority class (non-churn), with recall increasing from 0.76 to 0.79 and F1-score from 0.82 to 0.83. For the minority class (churn), however, recall decreased from 0.72 to 0.66, although precision increased from 0.52 to 0.54. These findings indicate that integrating Boruta-SHAP can improve model efficiency and interpretability, but additional strategies are needed to maintain performance on the minority class.
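For illustration, the pipeline summarized above (preprocessing, 80:20 split, SMOTE balancing, Boruta selection, XGBoost classification, SHAP interpretation) can be sketched in Python. This is a minimal sketch under assumptions: the library choices (pandas, scikit-learn, imbalanced-learn, BorutaPy, xgboost, shap), the file name, column handling, and all hyperparameters are illustrative and not the authors' exact configuration.

```python
# Minimal sketch of the described churn-prediction pipeline.
# Assumed libraries and settings; not the authors' exact configuration.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from imblearn.over_sampling import SMOTE
from boruta import BorutaPy
from xgboost import XGBClassifier
import shap

# Load the Telco Customer Churn dataset (file path is a placeholder).
df = pd.read_csv("telco_customer_churn.csv")

# Preprocessing / transformation: coerce TotalCharges to numeric and
# one-hot encode categorical features (assumed encoding choice).
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce").fillna(0)
y = (df["Churn"] == "Yes").astype(int)
X = pd.get_dummies(df.drop(columns=["customerID", "Churn"]), drop_first=True)

# 80:20 train-test split, as in the study.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Balance the training data with SMOTE.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

# Boruta feature selection (random forest base estimator is an assumption).
rf = RandomForestClassifier(n_jobs=-1, max_depth=5, random_state=42)
boruta = BorutaPy(rf, n_estimators="auto", random_state=42)
boruta.fit(X_res.values, y_res.values)
selected = X.columns[boruta.support_]

# Train XGBoost on the selected features and evaluate on the held-out test set
# with accuracy, precision, recall, and F1-score.
model = XGBClassifier(eval_metric="logloss", random_state=42)
model.fit(X_res[selected], y_res)
print(classification_report(y_test, model.predict(X_test[selected])))

# SHAP-based feature interpretation for the trained model.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test[selected])
shap.summary_plot(shap_values, X_test[selected])
```

In this sketch, SMOTE is applied only to the training split so the test set remains representative of the original class imbalance, which matches the per-class metrics reported above.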
                        
                        
                        
                        
                            