This paper proposes a Reinforcement Learning-based Model Predictive Controller (RL-MPC) for mobile robots operating in dynamic environments under stringent safety constraints. The key challenges addressed are model and perception uncertainty, moving obstacles, and real-time computational limits. The proposed framework combines three components: first, a learned dynamics model with uncertainty estimation to improve robustness in uncertain environments; second, a risk-aware MPC that uses chance constraints and Conditional Value-at-Risk (CVaR) to keep the probability of constraint violation below a predefined threshold; and third, a Control Barrier Function (CBF) safety layer that projects commanded actions back into a predefined safe set. Policy learning, based on Proximal Policy Optimization (PPO) or Soft Actor-Critic (SAC), is combined with reward shaping and safety shielding so that the robot prioritizes safety while maintaining task performance. Additionally, a sim-to-real strategy with domain randomization improves robustness when transferring from simulation to real-world deployment. The framework is evaluated in three scenarios: static obstacles, moving obstacles, and multi-agent traffic. The results show that RL-MPC reduces the safety violation rate to ≤2%, compared with 2.8–12.3% for the baselines; increases the minimum robot-obstacle distance to approximately 0.2 m in dynamic scenarios; and achieves a 95–99% success rate without significantly increasing path length or energy consumption. Although computation time increases by roughly 3–5 ms over classical MPC, the controller still meets the 20 ms per-cycle budget, making it suitable for real-time operation. An ablation study confirms that the CBF layer and the risk-based constraints are the dominant factors in preventing near-collisions, highlighting their crucial contribution to system safety. Overall, the RL-MPC framework provides a favorable trade-off between safety, efficiency, and implementation feasibility, offering a promising solution for online autonomous operation in dynamic environments.
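
To make the CBF safety layer concrete, the following is a minimal sketch, not the paper's implementation, of how a barrier-based filter can project a nominal action back into a safe set. It assumes single-integrator dynamics and a single circular obstacle; the function name `cbf_safety_filter`, the quadratic barrier, and all parameter values are illustrative.

```python
import numpy as np

def cbf_safety_filter(u_nom, x, x_obs, d_safe, alpha=1.0):
    """Project a nominal action into the safe set defined by one control
    barrier function, assuming single-integrator dynamics x' = u.

    Barrier: h(x) = ||x - x_obs||^2 - d_safe^2  (h >= 0 means safe).
    CBF condition: grad_h(x) . u + alpha * h(x) >= 0.
    With a single affine constraint, the QP
        min ||u - u_nom||^2  s.t.  grad_h . u + alpha * h >= 0
    has the closed-form projection used below.
    """
    diff = x - x_obs
    h = diff @ diff - d_safe**2           # barrier value
    grad_h = 2.0 * diff                   # gradient of h w.r.t. x
    slack = grad_h @ u_nom + alpha * h    # constraint residual at u_nom
    if slack >= 0.0:
        return u_nom                      # nominal action already safe
    # Minimal correction along the constraint normal.
    return u_nom - (slack / (grad_h @ grad_h)) * grad_h

# Example: the nominal action drives the robot straight at the obstacle;
# the filter slows it just enough to keep the barrier non-negative.
u_safe = cbf_safety_filter(
    u_nom=np.array([1.0, 0.0]),
    x=np.array([0.0, 0.0]),
    x_obs=np.array([0.6, 0.0]),
    d_safe=0.5,
)
print(u_safe)  # -> approximately [0.092, 0.0]
```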
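
Similarly, the CVaR term in the risk-aware MPC can be estimated empirically from sampled rollout costs. The sketch below uses a simple sorted-tail estimator under Monte-Carlo sampling; the paper's exact formulation may differ.

```python
import numpy as np

def empirical_cvar(costs, alpha=0.95):
    """Empirical Conditional Value-at-Risk: the mean of the worst
    (1 - alpha) fraction of sampled costs. A risk-aware constraint of
    the form CVaR_alpha(cost) <= threshold penalizes tail outcomes
    rather than only the expected cost."""
    costs = np.sort(np.asarray(costs, dtype=float))
    var_idx = int(np.ceil(alpha * len(costs))) - 1  # index of the VaR
    return costs[var_idx:].mean()                   # mean of the tail

# Example: rollout costs sampled under model/perception uncertainty.
rng = np.random.default_rng(0)
rollout_costs = rng.normal(loc=1.0, scale=0.3, size=1000)
print(empirical_cvar(rollout_costs, alpha=0.95))  # ~1.6: mean of worst 5%
```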