Two-order cooperative optimization of swarm control based on reinforcement learning
Abstract
This paper studies the cooperative optimal swarm control problem for two-order multi-agent systems with partially unknown nonlinear functions. Unlike traditional approaches that consider a single error, the performance index function here uses multi-order errors to achieve optimal control performance. Different proportional coefficients are assigned to reflect the varying influence of each error order, yielding a two-order cooperative (TOC) performance index function. To address the influence of the unknown nonlinear functions, a swarm control system based on sliding mode control with an actor-critic network is constructed, which makes the proposed method applicable to a variety of dynamic models. Furthermore, to alleviate the computational burden caused by the multi-order errors in the TOC performance index function, a new reinforcement learning (RL)-based sliding mode swarm controller is designed. The stability of the proposed controller is established using a Lyapunov function. Finally, the control model and control law are applied to a quadrotor unmanned aerial vehicle system, and simulation results demonstrate that the multi-agent system can effectively achieve swarm control.
Impact Statement: This paper proposes a reinforcement learning-based sliding mode control strategy for the cooperative optimal swarm control problem in which the nonlinear functions of two-order multi-agent systems are only partially known. It also proposes a two-order cooperative performance index function, which is optimized with respect to the multi-order errors simultaneously to achieve cooperative optimization. This contribution is relevant to research on sliding mode control strategies and error co-optimization.
