Abstract
Data collection is an essential part of Beyond-5G and Internet of Things applications. In urban areas, heterogeneous access points such as Wi-Fi routers and base stations can provide the communication coverage and bandwidth that data collection requires. In remote areas without communication infrastructure, however, it is hard to guarantee the communication quality of a large-scale data aggregation network. An existing approach is to use an unmanned aerial vehicle (UAV) as a mobile sink that performs data collection and extends the coverage of intelligent wireless sensing and communications. The efficiency and reliability of such a UAV-assisted data collection system can be significantly enhanced by an intelligent cooperative strategy that governs how the sensors deployed in the field communicate with the UAV. Furthermore, an energy-efficient trajectory planning algorithm is crucial to address the physical limitations of the UAV in this application. In this paper, the data collection process is modeled as a Markov decision process (MDP). The paper first proposes two heuristic greedy algorithms, the distance-greedy (DG) algorithm and the rate-greedy (RG) algorithm, which are designed from prior knowledge of the system and can guarantee completion of the data collection process in a remote area without the help of fixed communication infrastructure. Building on these results, a multi-agent greedy-model-based reinforcement learning (MG-RL) algorithm is proposed, which uses a purpose-designed environmental state and reward scheme and introduces multiple UAVs with different parameters that explore the environment in parallel to accelerate training. The two proposed greedy algorithms have lower implementation complexity, while the proposed MG-RL algorithm yields practical UAV flight trajectories and shortens the time needed to complete a data collection task.
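The abstract does not spell out the DG algorithm, so the following is only an illustrative sketch of what a distance-greedy visiting rule could look like, not the authors' method. It assumes 2D sensor coordinates, a single UAV that visits each sensor exactly once, and the hypothetical helper names `distance_greedy_order` and `path_length`.

```python
import math


def distance_greedy_order(start, sensors):
    """Return sensor indices in the order a distance-greedy UAV would visit them.

    Assumption (not from the paper): the UAV always flies next to the nearest
    sensor that still holds uncollected data, and each sensor is visited once.
    """
    position = start
    remaining = set(range(len(sensors)))
    order = []
    while remaining:
        # Greedy step: nearest unvisited sensor, ties broken by lower index.
        nearest = min(
            remaining,
            key=lambda i: (math.dist(position, sensors[i]), i),
        )
        order.append(nearest)
        remaining.remove(nearest)
        position = sensors[nearest]
    return order


def path_length(start, sensors, order):
    """Total straight-line flight distance for a given visiting order."""
    total = 0.0
    position = start
    for i in order:
        total += math.dist(position, sensors[i])
        position = sensors[i]
    return total


if __name__ == "__main__":
    # Hypothetical layout: three sensors on a line, UAV starting at the origin.
    sensors = [(5.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    order = distance_greedy_order((0.0, 0.0), sensors)
    print(order, path_length((0.0, 0.0), sensors, order))
```

A rate-greedy variant would presumably replace the distance key with an achievable data rate, but the abstract gives no channel model, so that is left out here.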
Affiliation: Dongguan University of Technology
