Journal of Vocational, Informatics and Computer Education
Vol 4, No 2 (2026): June 2026

Explainable Clinical-Operational Intelligence for Hospital Length of Stay Prediction Using Integrated Multi-Source Admission Data with Time-Based Evaluation

Dwi Putro Sarwo Setyohadi (Politeknik Negeri Jember)
Hendra Yufit Riskiawan (Politeknik Negeri Jember)
Aji Seto Arifianto (Politeknik Negeri Jember)
I Gede Wiryawan (Politeknik Negeri Jember)
Akas Bagus Setiawan (Politeknik Negeri Jember)



Article Info

Publish Date
01 Jun 2026

Abstract

Purpose - Hospital length of stay (LOS) affects bed turnover, discharge planning, staffing, and capacity. Integrated hospital data can strengthen LOS prediction and support decision-making. This study developed an explainable clinical-operational intelligence framework for LOS prediction using integrated admission data.

Methods - The dataset comprised 45,000 admissions with supporting patient, diagnostic, prescription, billing, ward, bed, staff, and insurance records. The data came from a structured simulation designed to resemble hospital operational data. An admission-level master table was constructed from demographic, temporal, clinical, pharmaceutical, insurance, operational, and patient history features. LOS regression and high-risk LOS classification were evaluated using a temporal split: 2020-2023 for training, 2024 for validation, and 2025 for testing. Ridge, Random Forest, XGBoost, and CatBoost were compared, followed by threshold optimization, label screening, and SHAP analysis.

Findings - CatBoost achieved the best LOS regression performance, with a test MAE of 1.606, an RMSE of 2.028, and an R² of 0.614. For classification, very_high_los_q90 produced the most balanced extreme-risk formulation, with an accuracy of 0.885 and a ROC-AUC of 0.802, whereas high_los_q75 yielded a recall of 0.998 and an F1-score of 0.604. SHAP indicated that prior admission history, diagnostic burden, medication-related features, and ward-level context were prominent drivers of LOS.

Research implications - Integrated hospital data are useful for detecting prolonged and extreme LOS, supporting better hospital planning and resource management.

Originality - This study offers an explainable modeling approach using integrated admission data to support LOS prediction and hospital analytics.
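The time-based evaluation and the quantile-style labels named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The record layout, the sample values, and the helper names (`temporal_split`, `quantile_threshold`, `label_high_los`) are hypothetical. The sketch also assumes that `high_los_q75` and `very_high_los_q90` refer to the 75th and 90th percentiles of LOS, and that the thresholds are estimated on the training years only so that no information from 2024 or 2025 leaks into the labels; the abstract states neither point explicitly.

```python
from datetime import date
from statistics import quantiles

# Hypothetical admission records: (admission_date, los_days).
# Values are invented for illustration and are not from the study.
ADMISSIONS = [
    (date(2020, 3, 1), 1.0), (date(2020, 9, 12), 2.0),
    (date(2021, 1, 5), 3.0), (date(2021, 7, 19), 4.0),
    (date(2022, 2, 8), 5.0), (date(2022, 11, 30), 6.0),
    (date(2023, 4, 14), 7.0), (date(2023, 10, 2), 8.0),
    (date(2024, 5, 21), 2.5), (date(2024, 8, 3), 6.5),
    (date(2025, 1, 17), 7.0), (date(2025, 6, 9), 9.0),
]


def temporal_split(records):
    """Split by admission year: 2020-2023 train, 2024 validation, 2025 test."""
    train = [r for r in records if 2020 <= r[0].year <= 2023]
    valid = [r for r in records if r[0].year == 2024]
    test = [r for r in records if r[0].year == 2025]
    return train, valid, test


def quantile_threshold(los_values, q):
    """Return the q-th percentile (q in 1..99) of the given LOS values."""
    cuts = quantiles(los_values, n=100, method="inclusive")  # 99 cut points
    return cuts[q - 1]


def label_high_los(records, threshold):
    """Binary label: 1 if LOS exceeds the threshold, else 0."""
    return [int(los > threshold) for _, los in records]


train, valid, test = temporal_split(ADMISSIONS)
train_los = [los for _, los in train]

# Assumption: thresholds come from the training period only.
q75 = quantile_threshold(train_los, 75)
q90 = quantile_threshold(train_los, 90)

high_los_q75 = label_high_los(test, q75)
very_high_los_q90 = label_high_los(test, q90)
```

Splitting by calendar year rather than at random keeps every test admission later than every training admission, which is closer to how a deployed model would be used.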

Copyrights © 2026






Journal Info

Abbrev

VOICE

Publisher

Subject

Computer Science & IT Education

Description

1. Informatics and Computing Research addressing the design, development, implementation, and evaluation of computing technologies relevant to educational, professional, and digital learning environments, including but not limited to: Artificial Intelligence and Machine Learning Deep Learning and ...