This research addresses multi-UAV systems for tracking multiple partially observable targets in noisy three-dimensional environments, a setting commonly encountered in defense and surveillance applications. It extends previous research, which focused mainly on two-dimensional, fully observable, and/or perfect-measurement settings. The targets are modeled as linear time-invariant systems with Gaussian noise, and the pursuer UAVs are represented by a standard six-degree-of-freedom model. The equations relating target observations to the pursuers' states are derived and expressed as a Gauss-Markov model. Because the targets are partially observable, the pursuers maintain beliefs over the target positions; in the noisy environment, an extended Kalman filter is used to estimate and update these beliefs. A decentralized multi-agent reinforcement learning (MARL) algorithm, soft Double Q-Learning, is proposed to learn coordinated control among the pursuers. The algorithm incorporates entropy regularization to train a stochastic policy and to enable interactions among pursuers that foster cooperative behavior. This regularization encourages exploration of wider, unknown search areas, which is important for multi-target tracking. The algorithm was trained and then deployed in several scenarios. Experiments with various sensor capabilities showed that the proposed algorithm achieved higher success rates than the baseline algorithm. A discussion of the key distinctions between two-dimensional and three-dimensional settings is also provided.
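As a rough illustration of the estimation and learning steps summarized above, the following is a minimal sketch. It assumes an illustrative linear-Gaussian (Gauss-Markov) target model with a nonlinear observation function $h$ evaluated at a pursuer's state, and an entropy-regularized double-Q target in the discrete-action style of soft Q-learning; the symbols ($A$, $Q$, $R$, $H_k$, $K_k$, $\alpha$) are placeholders and not the paper's notation.

% Assumed Gauss-Markov target model (x_k: target state, p_k: pursuer state, z_k: observation)
\begin{align*}
  x_{k+1} &= A\, x_k + w_k, & w_k &\sim \mathcal{N}(0, Q) \\
  z_k &= h(x_k, p_k) + v_k, & v_k &\sim \mathcal{N}(0, R)
\end{align*}

% Extended Kalman filter belief update on the mean \hat{x} and covariance P
\begin{align*}
  \hat{x}_{k|k-1} &= A\, \hat{x}_{k-1|k-1}, \qquad
  P_{k|k-1} = A\, P_{k-1|k-1} A^{\top} + Q \\
  H_k &= \left.\frac{\partial h}{\partial x}\right|_{\hat{x}_{k|k-1},\, p_k}, \qquad
  K_k = P_{k|k-1} H_k^{\top} \bigl(H_k P_{k|k-1} H_k^{\top} + R\bigr)^{-1} \\
  \hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \bigl(z_k - h(\hat{x}_{k|k-1}, p_k)\bigr), \qquad
  P_{k|k} = (I - K_k H_k)\, P_{k|k-1}
\end{align*}

% One possible entropy-regularized (soft) double-Q target with temperature \alpha,
% using the minimum of two Q-estimates; shown only as an illustrative form
\begin{align*}
  \pi(a' \mid s') &\propto \exp\!\Bigl(\tfrac{1}{\alpha} \min_{i \in \{1,2\}} Q_i(s', a')\Bigr) \\
  y &= r + \gamma \sum_{a'} \pi(a' \mid s') \Bigl( \min_{i \in \{1,2\}} Q_i(s', a') - \alpha \log \pi(a' \mid s') \Bigr)
\end{align*}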