This study analyzes how the Extreme Gradient Boosting (XGBoost) algorithm performs when handling missing data in telecommunications customer churn prediction. The objective is to compare the effect of three missing data imputation techniques (mean, k-NN, and MICE) on XGBoost performance using the IBM Telco Customer Churn dataset. The methodology comprises data preprocessing, implementation of the imputation techniques, XGBoost model training, and evaluation using accuracy, precision, recall, and F1-score. The results show that MICE imputation yields the best performance, with 81.24% accuracy, 69.80% precision, 58.40% recall, and a 63.60% F1-score, compared with 79.43% accuracy for XGBoost without imputation, a gain of 1.81 percentage points. These findings indicate that explicit missing data handling can enhance XGBoost's ability to identify customers who are likely to churn. In practice, the results offer the telecommunications industry guidance on optimizing customer retention strategies through more accurate churn prediction.
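As a reading aid for the four evaluation metrics named above, the following minimal sketch shows how each is derived from confusion-matrix counts and checks that the reported MICE F1-score is consistent with the reported precision and recall. The function names `f1_from` and `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def f1_from(precision, recall):
    """F1-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives, and
    true negatives, with "churn" treated as the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Hypothetical counts, for illustration only (not data from the study).
example = classification_metrics(tp=8, fp=2, fn=4, tn=6)

# Consistency check on the reported MICE figures: precision 69.80%, recall 58.40%.
reported_f1 = f1_from(0.6980, 0.5840)
print(f"F1 implied by reported precision and recall: {reported_f1:.4f}")
```

The implied value agrees with the reported 63.60% F1-score to within rounding of the inputs. Because recall is lower than precision, the F1-score sits closer to recall, which reflects that a sizeable share of actual churners is still missed even in the best configuration.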
Copyrights © 2026