Abstract
Filter feature selection methods select discriminative terms from high-dimensional text data to improve text classification performance and reduce computational cost. This paper provides a comprehensive systematic review of existing filter feature selection methods for text classification. First, we briefly discuss text classification based on filter feature selection. Second, we discuss in detail the mathematical design, effectiveness, and complexity of existing filter feature selection methods, grouped by methodology (supervised, unsupervised, and hybrid). In addition, we discuss a number of benchmark datasets used to evaluate the performance of filter feature selection methods in text classification. Finally, we outline future directions in filter feature selection and present our conclusions.