Customer retention is an economic imperative in retail banking, yet churn prediction is frequently hindered by severe class imbalance and the “Black Box” nature of complex algorithms. This study proposes a Heterogeneous Stacking Ensemble framework that integrates XGBoost, CatBoost, and Random Forest base learners with a Logistic Regression meta-learner to forecast customer attrition. To counter “Majority Class Bias,” we introduce a “Dual-Imbalance Defense” that combines the Synthetic Minority Over-sampling Technique (SMOTE) with cost-sensitive penalization at the algorithm level. Rather than relying on accuracy alone, the framework derives a dynamic classification threshold that enforces a recall of 0.90, prioritizing the capture of at-risk capital. Model opacity is addressed with a SHapley Additive exPlanations (SHAP) TreeExplainer. This approach, grounded in cooperative game theory, provides localized, customer-level “Reason Codes” for regulatory compliance and reveals global systemic vulnerabilities, including non-linear drivers such as the “Product Paradox.” With a recall of 0.90 and an AUC of 0.8654, the framework offers a statistically robust and operationally transparent tool for targeted customer retention.
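The recall-constrained threshold can be illustrated with a minimal sketch. The abstract does not state how the threshold is derived, so the rule below is an assumption: it selects the largest cutoff at which at least 90% of known churners score at or above it. The function names `threshold_for_recall` and `recall_at`, and the toy scores, are hypothetical and are not taken from the study.

```python
import math


def threshold_for_recall(y_true, scores, target_recall=0.90):
    """Largest threshold t such that classifying score >= t as churn
    reaches at least target_recall on the supplied labelled data.

    Assumption: this quantile-style rule is one common way to fix recall;
    it is not confirmed to be the derivation used in the study.
    """
    # Scores of the true churners (label 1), highest first.
    pos_scores = sorted(
        (s for y, s in zip(y_true, scores) if y == 1), reverse=True
    )
    if not pos_scores:
        raise ValueError("at least one positive example is required")
    # Minimum number of churners that must be captured; the small epsilon
    # guards against floating-point overshoot in the product.
    k = max(1, math.ceil(target_recall * len(pos_scores) - 1e-9))
    # The k-th highest churner score is the largest admissible cutoff.
    return pos_scores[k - 1]


def recall_at(y_true, scores, threshold):
    """Recall of the rule 'predict churn when score >= threshold'."""
    positives = sum(1 for y in y_true if y == 1)
    captured = sum(
        1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold
    )
    return captured / positives


# Hypothetical validation scores: 10 churners followed by 10 non-churners.
y_val = [1] * 10 + [0] * 10
p_val = [0.95, 0.90, 0.85, 0.80, 0.72, 0.66, 0.58, 0.41, 0.33, 0.12,
         0.70, 0.50, 0.45, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.05]

t = threshold_for_recall(y_val, p_val, target_recall=0.90)
print("threshold:", t, "recall:", recall_at(y_val, p_val, t))
```

A threshold chosen this way meets the recall target only on the data used to select it; recall on unseen customers can differ, which is why the cutoff would normally be fixed on a validation split and then checked on a separate test split.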
Copyright © 2026