Learning Diverse Ranking Based on Document Clustering and User Clicks

Abstract

As the Web grows rapidly, search engines play an increasingly important role in information retrieval on the Web. Ranking is one of the most important components of a search engine, and recent research shows that the need for diversity affects both the effectiveness of ranking results and user satisfaction. Most existing diverse ranking algorithms are offline. In this paper, we propose two online algorithms, RBADA and 2LRBA, for learning diverse rankings; both use the cluster hypothesis to improve the selection process in document ranking. Although they use cluster information in different ways, experimental results on a public dataset and a synthetic dataset show that both outperform the existing online ranking algorithm RBA. Moreover, as the number of documents grows, the diversification performance of RBA declines, while that of our two algorithms remains relatively stable provided the number of clusters stays unchanged.

Keywords

Information Retrieval; Diversification; User Click; Clustering; Online Algorithm