Abstract
Balanced clustering, which produces clusters of similar sizes, is useful in a variety of applications. However, existing clustering algorithms either cannot guarantee balanced results or incur relatively high time complexity to achieve them. In this work, we propose a constrained balanced clustering method, referred to as tau-balanced clustering, that generates clusters with a controllable degree of balance. First, the proposed method constrains cluster sizes during the cluster assignment phase using an established bound on cluster size and an established bound on the number of largest clusters. Second, we optimize the basic tau-balanced clustering method by eliminating unnecessary calculations through two-level filtering. Third, we design parallel versions of both the basic and the optimized methods for GPUs (Graphics Processing Units) to improve execution efficiency through high parallelism. Finally, we conduct a series of experiments on nine benchmark datasets to evaluate the proposed methods. The experimental results show that our methods outperform state-of-the-art methods.

Affiliation: Guilin University of Electronic Technology (桂林电子科技大学)