The ability to predict corporate bankruptcy is strategically important because it directly affects economic stability and investor confidence. Developing a reliable predictive model, however, is often hindered by the complexity of financial data, particularly the class imbalance between bankrupt and non-bankrupt companies. This imbalance biases models toward the majority class and reduces their sensitivity to bankruptcy cases, which are the most critical for financial decision-making. This research aims to construct a more balanced and sensitive bankruptcy prediction model by explicitly addressing data imbalance. The proposed approach combines Random Oversampling (ROS) to equalize the class distribution, Chi-Square feature selection to identify the most informative financial variables, and the Extreme Gradient Boosting (XGBoost) algorithm as the core predictive model. The dataset is the UCI Taiwanese Bankruptcy Prediction dataset, which consists of 6,819 observations and 96 financial ratio variables. Experimental results show that the Chi-Square method identified 20 influential variables, including Per Share Net Profit Before, Debt Ratio, and ROA(B) Before Interest and Depreciation After Tax. The proposed XGBoost model achieved an overall accuracy of 0.9648 and an F1-score of 0.4286, which the study reports as superior performance. These findings indicate that combining ROS, Chi-Square, and XGBoost improves data balance and prediction sensitivity for the bankruptcy class. This research is expected to serve as a foundation for financial decision-support systems that provide early warnings of potential corporate bankruptcy.
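The stages named in the abstract (random oversampling, chi-square feature scoring, and evaluation with accuracy and F1) can be illustrated with a minimal standard-library sketch. Everything in it is an assumption made for illustration: the toy data, the helper names `random_oversample`, `chi2_score`, and `accuracy_and_f1`, and the particular chi-square formulation (per-class feature sums compared with their expected share, which is one common choice for non-negative features). It is not the study's implementation, and it omits the XGBoost classifier itself.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def chi2_score(feature, labels):
    """Chi-square statistic of one non-negative feature against the labels.
    Observed = sum of the feature within each class;
    expected = total feature sum times the class frequency."""
    total = sum(feature)
    n = len(labels)
    score = 0.0
    for cls in set(labels):
        observed = sum(x for x, y in zip(feature, labels) if y == cls)
        expected = total * sum(1 for y in labels if y == cls) / n
        if expected > 0:
            score += (observed - expected) ** 2 / expected
    return score


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return correct / len(y_true), f1


# Hypothetical toy data: 8 non-bankrupt (0) and 2 bankrupt (1) firms.
# Column 0 differs between classes; column 1 is constant (uninformative).
X = [[0.9, 0.5], [0.8, 0.5], [0.85, 0.5], [0.9, 0.5], [0.7, 0.5],
     [0.95, 0.5], [0.8, 0.5], [0.9, 0.5], [0.1, 0.5], [0.2, 0.5]]
y = [0] * 8 + [1] * 2

# Step 1: balance the classes by random oversampling.
X_bal, y_bal = random_oversample(X, y)

# Step 2: score each feature and rank from most to least informative.
scores = [chi2_score([row[j] for row in X_bal], y_bal) for j in range(2)]
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)

# Step 3: a classifier that always predicts the majority class shows why
# accuracy alone can mislead on imbalanced data.
all_majority = [0] * len(y)
acc, f1 = accuracy_and_f1(y, all_majority)

print(Counter(y_bal), ranked, acc, f1)
```

The last step shows why an imbalanced problem is usually judged on F1 as well as accuracy: a predictor that never flags bankruptcy still scores well on accuracy while its F1 for the bankruptcy class is zero. In practice these stages are typically taken from established libraries, for example `RandomOverSampler` in imbalanced-learn, `SelectKBest` with `chi2` in scikit-learn, and `XGBClassifier` in xgboost; whether the study used these exact components is not stated in the abstract.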
Copyright © 2026