This study compares the performance of four machine learning algorithms (Logistic Regression, Decision Tree, Random Forest, and XGBoost) in predicting corporate bankruptcy from financial ratio data, and provides recommendations on their practical applicability. The research used a dummy dataset sourced from Kaggle, which was preprocessed through cleaning, transformation, and encoding. The data were divided into training (80%) and testing (20%) subsets using stratified sampling, so that both subsets preserve the class proportions of the imbalanced data. Model performance was evaluated using accuracy, precision, recall, and F1-score. All models demonstrated high predictive capability: Random Forest achieved the highest accuracy (96.7%), closely followed by XGBoost (≈96%), while Logistic Regression (96%) and Decision Tree (95%) also showed strong results. Profitability and leverage indicators, particularly Net Income to Total Assets and debt ratios, emerged as the most influential predictors. The findings indicate that ensemble tree-based methods offer marginally superior performance, which is attributed to their ability to capture non-linear interactions, while Logistic Regression remains valuable for its interpretability. The use of a dummy dataset limits the direct generalizability of the findings to real-world financial systems. Moreover, reliance on a single train–test split may overestimate model stability. Future research should employ real-world datasets, apply cross-validation, and explore explainability methods such as SHAP or LIME to enhance transparency. This study contributes a comparative evaluation of machine learning models for bankruptcy prediction that highlights the trade-off between accuracy and interpretability. The results offer practical insights for auditors, investors, and regulators in selecting predictive tools that balance performance, transparency, and decision-making needs.
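The stratified 80/20 split and the four evaluation metrics described above can be illustrated with a minimal, standard-library sketch. It is not the study's code: the function names `stratified_split` and `binary_metrics`, the toy labels, and the always-solvent baseline are assumptions introduced here purely for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same share in both subsets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels (not the Kaggle data): 1 = bankrupt, 0 = solvent.
labels = [1] * 10 + [0] * 90
train_idx, test_idx = stratified_split(labels)
y_test = [labels[i] for i in test_idx]

# Hypothetical baseline that predicts "solvent" for every firm.
baseline = binary_metrics(y_test, [0] * len(y_test))
print(len(train_idx), len(test_idx), baseline)
```

On these toy labels the baseline is correct for every solvent firm yet identifies no bankrupt firm, so its accuracy is high while its recall is zero. This is one reason to report precision, recall, and F1-score alongside accuracy when classes are imbalanced.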
Copyright © 2025