摘要

The selection of discriminative terms from large quantity of terms in text documents is helpful for achieving better accuracy of text classification. To focus on the task of selecting discriminative terms from text, a deep learning based feature selection method is proposed. The method is developed by using the long short term memory (LSTM) network. A deep network based on LSTM is trained in unsupervised manner to extracted deep features from bag-of-words term frequency vectors. The deep features are integrated with term frequencies to evaluate the effectiveness of terms. The proposed method extends the limitation of term frequency information by applying deep features for feature selection. Experiments in nine public datasets demonstrate better performance of our method in selecting discriminative terms than comparative methods.

全文