A Graph-based Feature Selection Method for Learning to Rank Using Spectral Clustering for Redundancy Minimization and Biased PageRank for Relevance Analysis

Jen-Yuan Yeh¹ and Cheng-Jung Tsai²

  1. Dept. of Operation, Visitor Service, Collection and Information Management, National Museum of Natural Science
    No. 1, Guanqian Rd., North Dist., Taichung City 404, Taiwan (R.O.C.)
    jenyuan@nmns.edu.tw
  2. Graduate Institute of Statistics and Information Science, National Changhua University of Education
    No. 1, Jinde Rd., Changhua City, Changhua County 500, Taiwan (R.O.C.)
    cjtsai@cc.ncue.edu.tw

Abstract

This paper addresses the feature selection problem in learning to rank (LTR). We propose a graph-based feature selection method, named FS-SCPR, which comprises four steps: (i) use ranking information to assess the similarity between features and construct an undirected feature similarity graph; (ii) apply spectral clustering to group features using eigenvectors of matrices derived from the graph; (iii) use biased PageRank to assign each feature a relevance score with respect to the ranking problem, with each feature's ranking performance serving as the preference that biases the PageRank computation; and (iv) apply optimization to select, from each cluster, the feature that has both the highest relevance score and the most information about the features in that cluster. We also develop a new approach to LTR for information retrieval (IR) that first uses FS-SCPR as a preprocessor to identify discriminative and useful features and then employs Ranking SVM to derive a ranking model from the selected features. An evaluation conducted on the LETOR benchmark datasets demonstrated that our approach performs competitively with representative feature selection methods and state-of-the-art LTR methods.
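To make step (iii) more concrete, the following is a minimal sketch of a biased (personalized) PageRank computed by power iteration over a feature similarity graph. It is an illustration under stated assumptions, not the authors' implementation: the function name `biased_pagerank`, the damping factor of 0.85, the toy similarity matrix, and the toy preference values (standing in for each feature's individual ranking performance) are all hypothetical.

```python
import numpy as np

def biased_pagerank(similarity, preference, damping=0.85, tol=1e-10, max_iter=1000):
    """Biased PageRank over an undirected, weighted feature graph.

    similarity : (n, n) symmetric non-negative matrix of feature similarities.
    preference : length-n non-negative vector used to bias the teleport step
                 (assumed here to be each feature's own ranking performance).
    damping    : assumed value; the abstract does not specify one.
    """
    W = np.asarray(similarity, dtype=float).copy()
    np.fill_diagonal(W, 0.0)                      # ignore self-similarity
    n = W.shape[0]

    p = np.asarray(preference, dtype=float)
    p = p / p.sum()                               # normalize to a distribution

    # Column-stochastic transition matrix; a node with no edges
    # jumps according to the preference vector instead.
    col_sums = W.sum(axis=0)
    safe = np.where(col_sums > 0, col_sums, 1.0)
    P = np.where(col_sums > 0, W / safe, p[:, None])

    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = damping * (P @ r) + (1.0 - damping) * p
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# Toy example: three equally similar features, the first one preferred.
toy_similarity = np.array([[0.0, 1.0, 1.0],
                           [1.0, 0.0, 1.0],
                           [1.0, 1.0, 0.0]])
toy_preference = np.array([0.6, 0.2, 0.2])        # hypothetical performance scores
scores = biased_pagerank(toy_similarity, toy_preference)
```

In this toy graph all features are structurally identical, so any difference in the resulting scores comes only from the preference vector; this is the sense in which the preference "biases" the computation.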

Key words

Feature selection; Feature similarity graph; Spectral clustering; Biased PageRank; Learning to rank; Information retrieval

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS201220042Y

How to cite

Yeh, J., Tsai, C.: A Graph-based Feature Selection Method for Learning to Rank Using Spectral Clustering for Redundancy Minimization and Biased PageRank for Relevance Analysis. Computer Science and Information Systems, https://doi.org/10.2298/CSIS201220042Y