Abstract
With the rapid development of cloud computing, dynamic virtual machine (VM) scheduling is becoming increasingly important. Existing work formulates VM scheduling as a bin-packing problem and solves it with greedy methods. However, cloud service providers have widely adopted multi-NUMA servers in recent years, and existing methods do not account for this architecture. This paper formulates multi-NUMA VM scheduling as a novel structured combinatorial optimization problem and transforms it into a reinforcement learning problem. We propose a reinforcement learning algorithm, SchedRL, which uses a delta reward scheme and an episodic guided sampling strategy to solve the problem efficiently. Evaluated on a public Azure dataset under two different scenarios, SchedRL outperforms FirstFit and BestFit in fulfill number and allocation rate.