Mutation strategy selection and parameter setting are well-known challenges in enhancing the performance of differential evolution (DE). In this paper, we propose to formulate these problems as a parametrized-action Markov decision process. A multi-pass deep Q-network (MP-DQN) is used as the reinforcement learning method for the parametrized action space. The MP-DQN architecture comprises an actor network and a Q-network, both trained offline. The networks' weights are trained on samples of states, actions, and rewards collected at every DE iteration. We use 99 features to describe the state of DE and experiment with four reward definitions. A benchmark study on functions from CEC2005 compares the proposed method with baseline DE methods without parameter control, a DE with a random scaling factor, other DEs with adaptive operator selection, and the two winners of CEC2005. The results show that DE with MP-DQN parameter control outperforms the baseline DE methods and obtains competitive results compared with the other methods.