Learning soft threshold for sparse reparameterization using gradual projection operators
Abstract
Deep neural networks (DNNs) have achieved great success in the field of computer vision in recent years. While highly accurate, their over-parameterization impedes deployment on lightweight devices. To obtain parameter-efficient networks, a large body of work based on uniform sparsity or heuristic non-uniform sparsity techniques has been explored. However, these sparsity techniques offer limited improvement in inference cost (FLOPs) and prediction accuracy. To make further progress, we propose novel gradual projection operators (GPO) to learn the soft threshold for sparse reparameterization. GPO approximates the soft-threshold operator with a family of projection operators that progressively reduce the gradient of the weights to be pruned during training, thereby gently learning the pruning thresholds. Experiments on ImageNet show that ResNet-50 trained with the proposed algorithm achieves 76.52% top-1 validation accuracy at 81.62% sparsity, merely a 0.5% accuracy gap from its dense counterpart. Additionally, the non-uniform budgets learned by GPO reduce FLOPs by up to 10% compared to the state of the art, outperforming popular heuristic methods and thus yielding an effective mechanism for sparse reparameterization.
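To make the mechanism concrete, the following is a minimal sketch, not the paper's implementation, of one plausible projection-operator family: it uses softplus(βx)/β as a smooth surrogate for max(x, 0), which converges to the soft-threshold operator as β grows while leaving sub-threshold weights a small, decaying gradient. The function names, the softplus form, and the β schedule are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def soft_threshold(w: torch.Tensor, tau: torch.Tensor) -> torch.Tensor:
    """Standard soft-threshold operator: sign(w) * max(|w| - tau, 0)."""
    return torch.sign(w) * F.relu(w.abs() - tau)

def gradual_projection(w: torch.Tensor, tau: torch.Tensor, beta: float) -> torch.Tensor:
    """Hypothetical smooth member of a projection-operator family.

    softplus(beta * x) / beta -> max(x, 0) as beta -> infinity, so this
    operator approaches soft_threshold(w, tau). For finite beta, weights
    below the threshold (|w| < tau) still receive a nonzero gradient
    (sigmoid(beta * (|w| - tau))) that shrinks as beta is annealed upward,
    mirroring the "gradual" reduction described in the abstract.
    """
    return torch.sign(w) * F.softplus(beta * (w.abs() - tau)) / beta

if __name__ == "__main__":
    w = torch.linspace(-1.0, 1.0, 9)
    tau = torch.tensor(0.5)
    for beta in (1.0, 10.0, 100.0):
        gap = (gradual_projection(w, tau, beta) - soft_threshold(w, tau)).abs().max()
        # The gap to the exact soft-threshold output vanishes as beta grows.
        print(f"beta={beta:6.1f}  max deviation from soft threshold: {gap.item():.4f}")
```

In a sparse layer, the effective weight would be gradual_projection(weight, tau, beta), with tau a learnable per-layer threshold and beta increased over training so the reparameterization hardens into the soft-threshold operator; both of these training details are assumptions here.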
