As autonomous robotic systems are increasingly deployed in industrial applications, there is a growing need for efficient, automated decision-making capabilities that can operate in complex environments with a large range of possible actions. Reinforcement learning (RL) offers an effective way to train robotic agents, but conventional RL techniques often suffer from slow and unstable policy learning, poor convergence, and a weak exploration-exploitation balance. To address these limitations, this paper develops a hybrid optimization approach that integrates reinforcement learning, deep learning, and metaheuristic optimization for more robust robotic control and adaptability. The proposed approach uses a Deep Q-Network (DQN) with experience replay for policy learning, while an Adaptive Gradient-Based Sled Dog Optimizer refines the resulting decision-making. Epsilon-greedy action selection combined with Noisy Networks provides a hybrid exploration-exploitation strategy that supports learning. The method was validated against five existing approaches, namely Conservative Q-Learning, Behavior Regularized Actor-Critic, Implicit Q-Learning, Twin Delayed Deep Deterministic Policy Gradient, and Soft Actor-Critic, on three robotic benchmarks: MuJoCo, D4RL, and the OpenAI Gym Robotics Suite. The results show that the proposed approach consistently outperformed the baseline methods in accuracy, precision, recall, stability, convergence speed, and generalization. The performance gains were confirmed through statistical validation, including confidence-interval analysis and p-value tests.
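To make the learning components named above concrete, the following is a minimal sketch, assuming a standard PyTorch-style DQN with experience replay and a hybrid exploration scheme that layers epsilon-greedy selection over a noisy (parameter-noise) Q-head. The Adaptive Gradient-Based Sled Dog Optimizer is omitted because its update rules are not given here; all dimensions, hyperparameters, and the random-transition demo are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: DQN + experience replay + epsilon-greedy over a noisy Q-head.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinear(nn.Module):
    """Linear layer with learnable Gaussian parameter noise (simplified NoisyNet)."""

    def __init__(self, in_features, out_features, sigma0=0.017):
        super().__init__()
        self.mu_w = nn.Parameter(torch.empty(out_features, in_features).uniform_(-0.1, 0.1))
        self.sigma_w = nn.Parameter(torch.full((out_features, in_features), sigma0))
        self.mu_b = nn.Parameter(torch.zeros(out_features))
        self.sigma_b = nn.Parameter(torch.full((out_features,), sigma0))

    def forward(self, x):
        # Sample fresh noise each forward pass; the noise scale is learned.
        w = self.mu_w + self.sigma_w * torch.randn_like(self.sigma_w)
        b = self.mu_b + self.sigma_b * torch.randn_like(self.sigma_b)
        return F.linear(x, w, b)


class QNetwork(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU())
        self.head = NoisyLinear(64, n_actions)  # noise-driven exploration in the output layer

    def forward(self, obs):
        return self.head(self.body(obs))


class ReplayBuffer:
    def __init__(self, capacity=10_000):
        self.buf = deque(maxlen=capacity)

    def push(self, *transition):  # (obs, action, reward, next_obs, done)
        self.buf.append(transition)

    def sample(self, batch_size):
        batch = random.sample(self.buf, batch_size)
        obs, act, rew, nxt, done = zip(*batch)
        return (torch.stack(obs), torch.tensor(act), torch.tensor(rew, dtype=torch.float32),
                torch.stack(nxt), torch.tensor(done, dtype=torch.float32))


def select_action(q_net, obs, n_actions, epsilon):
    """Hybrid exploration: epsilon-greedy layered on top of the noisy Q-head."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(q_net(obs.unsqueeze(0)).argmax(dim=1))


def dqn_update(q_net, target_net, buffer, optimizer, batch_size=32, gamma=0.99):
    """One TD-learning step on a minibatch drawn from the replay buffer."""
    obs, act, rew, nxt, done = buffer.sample(batch_size)
    q = q_net(obs).gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rew + gamma * (1.0 - done) * target_net(nxt).max(dim=1).values
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    obs_dim, n_actions = 8, 4  # illustrative dimensions only
    q_net, target_net = QNetwork(obs_dim, n_actions), QNetwork(obs_dim, n_actions)
    target_net.load_state_dict(q_net.state_dict())
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    buffer = ReplayBuffer()
    for _ in range(64):  # fill the buffer with random transitions for demonstration
        o, o2 = torch.randn(obs_dim), torch.randn(obs_dim)
        a = select_action(q_net, o, n_actions, epsilon=0.1)
        buffer.push(o, a, random.random(), o2, float(random.random() < 0.05))
    print("loss:", dqn_update(q_net, target_net, buffer, optimizer))
```

In this sketch the epsilon-greedy branch handles coarse random exploration while the NoisyLinear head injects state-dependent, learned perturbations into the greedy branch, which is one plausible reading of the hybrid exploration-exploitation scheme described above.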