Autonomous target tracking of multi-UAV: A two-stage deep reinforcement learning approach with expert experience

Wang, Jiahua; Zhang, Ping*; Wang, Yang
Science Citation Index Expanded
Guizhou University

Abstract

In recent years, deep reinforcement learning (DRL) has developed rapidly and has been applied to multi-UAV target tracking (MTT). However, DRL still faces challenges in data utilization and learning speed. To address these problems, this paper proposes a novel two-stage DRL-based multi-UAV decision-making method. Specifically, a sample generator combining an artificial potential field with proportional-integral-derivative control produces expert experience data. On this basis, a two-stage reinforcement learning training method is introduced. In the first stage, the policy network and critic network are pre-trained on the expert data with a behavior cloning loss and an additional Q-value loss, which reduces ineffective exploration and speeds up learning. In the second (RL) stage, the average return of the k most recent excellent episodes is calculated and used to screen out the excellent experience generated by the agent itself, which then guides the policy network toward high-reward actions and improves the efficiency of data utilization. Extensive simulation experiments show that the method not only enables multiple UAVs to track the target continuously in environments with obstacles but also significantly improves learning speed and convergence. © 2023 Elsevier B.V.
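
The second-stage screening step can be illustrated with a minimal sketch. The abstract does not give the exact screening rule, so the code below assumes one: an episode counts as excellent when its return is at least the mean return of the k most recent excellent episodes. The class name `ExcellentEpisodeFilter`, its methods, and the buffer layout are hypothetical and are not taken from the paper.

```python
from collections import deque


class ExcellentEpisodeFilter:
    """Screens self-generated episodes against recent excellent returns.

    Assumed rule (not from the paper): an episode is excellent when its
    return is at least the mean return of the k most recent excellent episodes.
    """

    def __init__(self, k):
        # Returns of the k most recent excellent episodes (oldest dropped first).
        self.recent_returns = deque(maxlen=k)
        # Transitions collected from episodes that passed the screen.
        self.buffer = []

    def threshold(self):
        # With no excellent episode recorded yet, accept anything.
        if not self.recent_returns:
            return float("-inf")
        return sum(self.recent_returns) / len(self.recent_returns)

    def offer(self, transitions, episode_return):
        """Store the episode if it passes the screen; return True if stored."""
        if episode_return >= self.threshold():
            self.recent_returns.append(episode_return)
            self.buffer.extend(transitions)
            return True
        return False
```

In a training loop, `offer` would be called once per finished episode, and the stored transitions could then be sampled to bias policy updates toward high-return behavior. How the paper actually uses the screened data in the policy loss is not specified in the abstract.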

Keywords

Multi-UAV; DRL; TD3; Expert experience; Target tracking