摘要
In many practical applications, it is crucial to perform automatic data clustering without knowing the number of clusters in advance. The evolutionary computation paradigm is good at dealing with this task, but the existing algorithms encounter several deficiencies, such as the encoding redundancy and the cross-dimension learning error. In this article, we propose a novel elastic differential evolution algorithm to solve automatic data clustering. Unlike traditional methods, the proposed algorithm considers each clustering layout as a whole and adapts the cluster number and cluster centroids inherently through the variable-length encoding and the evolution operators. The encoding scheme contains no redundancy. To enable the individuals of different lengths to exchange information properly, we develop a subspace crossover and a two-phase mutation operator. The operators employ the basic method of differential evolution and, in addition, they consider the spatial information of cluster layouts to generate offspring solutions. Particularly, each dimension of the parameter vector interacts with its correlated dimensions, which not only adapts the cluster number but also avoids the cross-dimension learning error. The experimental results show that our algorithm outperforms the state-of-the-art algorithms that it is able to identify the correct number of clusters and obtain a good cluster validation value.
-
单位中山大学