This study proposes an intelligent resource orchestration system for AI-driven digital commerce platforms using a reinforcement learning (RL) framework to address growing challenges in dynamic workload management, latency reduction, and service efficiency. Grounded in contemporary advances in machine learning–based cloud orchestration, the research investigates the effectiveness of the Proximal Policy Optimization (PPO) algorithm in optimizing resource allocation under complex, non-stationary platform conditions. A simulation-based experimental design was employed, incorporating real-world platform logs and synthetic workload scenarios to evaluate system responsiveness, throughput, and cost efficiency relative to heuristic and threshold-based baselines. The findings demonstrate that the RL-driven orchestrator consistently outperforms conventional methods, achieving lower latency, more stable throughput, and greater adaptability during peak-load fluctuations. The results further show that the agent learns effective policies despite environmental uncertainty, supporting the feasibility of model-free RL for large-scale digital commerce environments. The study contributes theoretically by extending sequential decision-making models to digital commerce orchestration and practically by offering a scalable, autonomous solution that enhances platform performance. Limitations include the controlled simulation environment and the focus on a single RL algorithm, suggesting the need for real-world deployment and exploration of alternative RL variants in future research. Overall, the study strengthens the case for adopting RL-based orchestration as a foundational architecture for next-generation intelligent digital commerce systems.
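
To make the orchestration setup concrete, the following is a minimal sketch of how a PPO agent could be trained against a simulated digital-commerce workload. It assumes a Gymnasium-style environment and the stable-baselines3 PPO implementation; the environment dynamics, reward weights, and sinusoidal workload model are illustrative assumptions, not the paper's actual simulator.

# Minimal PPO orchestration sketch (assumed environment; not the paper's simulator).
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class CommerceOrchestrationEnv(gym.Env):
    """Toy digital-commerce workload: the agent scales capacity down, holds, or scales up."""

    def __init__(self, episode_len=288):
        super().__init__()
        self.episode_len = episode_len
        self.action_space = spaces.Discrete(3)  # 0 = scale down, 1 = hold, 2 = scale up
        self.observation_space = spaces.Box(
            low=0.0, high=np.inf, shape=(3,), dtype=np.float32
        )

    def _workload(self, t):
        # Non-stationary demand: a daily cycle plus Gaussian noise (assumed model).
        cycle = 40.0 * np.sin(2 * np.pi * t / self.episode_len)
        return max(50.0 + cycle + self.rng.normal(0.0, 5.0), 0.0)

    def _obs(self):
        load = self._workload(self.t)
        latency = load / max(self.capacity, 1e-6)  # crude latency proxy
        return np.array([load, self.capacity, latency], dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.rng = np.random.default_rng(seed)
        self.t = 0
        self.capacity = 60.0
        return self._obs(), {}

    def step(self, action):
        # Apply the scaling decision, bounded to a feasible capacity range.
        self.capacity = float(np.clip(self.capacity + 10.0 * (int(action) - 1), 10.0, 200.0))
        self.t += 1
        obs = self._obs()
        _, capacity, latency = obs
        # Reward: penalize latency (SLA violations) and over-provisioning (cost).
        reward = -latency - 0.01 * capacity
        truncated = self.t >= self.episode_len
        return obs, reward, False, truncated, {}

env = CommerceOrchestrationEnv()
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=50_000)  # short training budget, for illustration only

The reward here encodes the latency-versus-cost trade-off that the study's orchestrator optimizes; in practice the paper's simulator would replace the toy workload model with replayed platform logs and synthetic peak-load scenarios.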