Garuda - Garba Rujukan Digital

P-Index (2021 - 2026): 0.23

This author has published in the following journals:

Journal of Applied Data Sciences

Chau Dinh Linh

Ho Chi Minh University of Banking

Author-ID : 9846777

Computer Science & IT Control & Systems Engineering Decision Sciences, Operations Research & Management

Published: 1 document

Articles

Forecasting Bank Efficiency Using Data Envelopment Analysis with Directional Distance Functions and Machine Learning: Time-Series Validation and Shapley Value Interpretation
Chau Dinh Linh
Journal of Applied Data Sciences Vol 7, No 2: May 2026
Publisher : Bright Publisher

DOI: 10.47738/jads.v7i2.1244

This study develops a structured framework to forecast the operational efficiency of commercial banks in Vietnam. The analysis is based on a balanced panel of 27 banks over the period 2016–2024. Bank efficiency is first measured using a directional distance function within a data envelopment analysis framework (DEA – DDF). This approach incorporates both desirable outputs and undesirable outputs related to credit risk. The estimated efficiency scores are then used as prediction targets in several machine learning models. Model performance is evaluated under both conventional test settings and time-series cross-validation, and predictions are interpreted using Shapley value–based analysis (SHAP). Under a conventional test set, the gradient boosting model (XGBoost) shows the best performance, with a root mean squared error of 0.060 and a coefficient of determination (R²) of 0.353. However, when time-series cross-validation is applied to reflect realistic forecasting conditions, predictive accuracy declines sharply. The average coefficient of determination falls to approximately 0.005. This suggests that static validation can overstate performance and that forecasting efficiency in a changing financial environment remains difficult. The interpretation results provide additional insights. Net interest margin has a positive effect on predicted efficiency, although the effect weakens at very high levels. The cost-to-income ratio shows a threshold around 0.55, beyond which efficiency declines more strongly. Bank size has a largely neutral impact. The interaction between capital adequacy and profitability shows a conditionally negative pattern. Prediction errors are larger in the most recent year and among banks with very high efficiency scores. In summary, the results highlight both the potential and the limitations of machine learning in forecasting efficiency and emphasize the importance of time-aware validation.
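
The contrast the abstract draws between a conventional test set and time-series cross-validation can be illustrated with a small sketch. The splitter below is not the study's code: the function name `expanding_year_splits`, the `min_train_years` parameter, and the synthetic row layout are assumptions made for illustration. Only the panel dimensions (27 banks, 2016 to 2024) come from the abstract. Each fold tests on one year and trains only on strictly earlier years, so no future information reaches the model.

```python
from typing import Iterator, List, Sequence, Tuple


def expanding_year_splits(
    years: Sequence[int], min_train_years: int = 3
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for an expanding-window split.

    Each test fold is a single calendar year; the matching training fold
    holds every row from strictly earlier years. Hypothetical helper,
    not taken from the article.
    """
    unique_years = sorted(set(years))
    for k in range(min_train_years, len(unique_years)):
        test_year = unique_years[k]
        train_idx = [i for i, y in enumerate(years) if y < test_year]
        test_idx = [i for i, y in enumerate(years) if y == test_year]
        yield train_idx, test_idx


# Assumed panel layout: one row per bank-year, 27 banks, 2016 to 2024.
panel_years = [year for year in range(2016, 2025) for _ in range(27)]
folds = list(expanding_year_splits(panel_years, min_train_years=3))

for train_idx, test_idx in folds:
    test_year = panel_years[test_idx[0]]
    print(f"test year {test_year}: {len(train_idx)} train rows, {len(test_idx)} test rows")
```

A random train/test split, by contrast, mixes years across both sides, which is one way a static evaluation can look better than a forward-looking one.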
