This study addresses the challenge of optimizing ride-hailing dispatch and repositioning under data limitations by proposing an end-to-end digital-twin dispatching framework that integrates spatio-temporal demand forecasting with offline reinforcement learning. Using publicly available NYC FOIL ride-hailing data aggregated at the dispatching-base level, it evaluates whether coarse-grained data can still support reliable, reproducible decision-making pipelines. The methodology has two components: (i) multivariate time-series forecasting of next-day demand using baseline models, a temporal convolutional network (TCN), and a spatio-temporal transformer; and (ii) a digital-twin simulation combined with action-constrained offline reinforcement learning, including behavior cloning (BC) and Conservative Q-Learning (CQL), to optimize fleet repositioning decisions. Experimental results show that the TCN achieves the best forecasting accuracy on the test period, although the gains are driven largely by the dominant demand regions. In the control phase, conservative policies such as CQL perform stably and reduce repositioning costs, but do not significantly outperform behavior cloning because training data are limited. The findings indicate that, in coarse aggregate settings, operational improvements depend more on controlling policy sensitivity than on marginal forecasting gains. The study contributes a reproducible benchmark pipeline and highlights the importance of conservative control strategies, transparent assumptions, and sensitivity analysis when deploying AI-driven mobility systems based on limited or aggregated data.
Copyright © 2025