Issue No. 6 - June 2013 (vol. 62)
pp. 1221-1233
Hanjiang Lai , Sun Yat-sen University, Guangzhou
Yan Pan , Sun Yat-sen University, Guangzhou
Cong Liu , Sun Yat-sen University, Guangzhou
Liang Lin , Sun Yat-sen University, Guangzhou
Jie Wu , Temple University, Philadelphia
Learning-to-rank for information retrieval has gained increasing interest in recent years. Inspired by the success of sparse models, we consider the problem of sparse learning-to-rank, where the learned ranking models are constrained to have only a few nonzero coefficients. We begin by formulating the sparse learning-to-rank problem as a convex optimization problem with a sparsity-inducing $\ell_1$ constraint. Since the $\ell_1$ constraint is nondifferentiable, the critical issue is how to solve the optimization problem efficiently. To address this issue, we propose a learning algorithm from the primal-dual perspective. Furthermore, we prove that, after at most $O({1\over \epsilon})$ iterations, the proposed algorithm is guaranteed to obtain an $\epsilon$-accurate solution. This convergence rate is better than that of the popular subgradient descent algorithm, i.e., $O({1\over \epsilon^2})$. Empirical evaluation on several public benchmark data sets demonstrates the effectiveness of the proposed algorithm: 1) compared to methods that learn dense models, learning a ranking model with a sparsity constraint significantly improves ranking accuracy; 2) compared to other methods for sparse learning-to-rank, the proposed algorithm tends to obtain sparser models and achieves superior gains in both ranking accuracy and training time; 3) compared to several state-of-the-art algorithms, the ranking accuracy of the proposed algorithm is competitive and stable.
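For concreteness, the following is a minimal sketch of the $O({1\over \epsilon^2})$ baseline the abstract compares against: projected subgradient descent on a pairwise hinge-loss formulation of ranking under the constraint $\Vert w\Vert_1 \le z$. This is not the paper's primal-dual algorithm; the function names, the data layout (one feature matrix of relevant documents, one of nonrelevant), and the step-size schedule are illustrative assumptions. The $\ell_1$ projection follows the standard sorting-based method of Duchi et al. (ICML 2008).

import numpy as np

def project_l1_ball(v, z=1.0):
    # Euclidean projection of v onto the l1 ball of radius z
    # (sorting-based method of Duchi et al., ICML 2008).
    if np.abs(v).sum() <= z:
        return v
    u = np.sort(np.abs(v))[::-1]          # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > (css - z))[0][-1]
    theta = (css[rho] - z) / (rho + 1.0)  # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def rank_subgradient(X_pos, X_neg, z=1.0, T=1000):
    # Minimize the pairwise hinge loss sum_k max(0, 1 - w'(x_i - x_j))
    # subject to ||w||_1 <= z by projected subgradient descent.
    d = X_pos.shape[1]
    # All pairwise feature differences (relevant minus nonrelevant).
    D = (X_pos[:, None, :] - X_neg[None, :, :]).reshape(-1, d)
    w = np.zeros(d)
    for t in range(1, T + 1):
        viol = D[D @ w < 1.0]             # pairs violating the margin
        g = -viol.sum(axis=0)             # a subgradient of the hinge loss
        w = project_l1_ball(w - g / np.sqrt(t), z)  # 1/sqrt(t) step size
    return w

# Illustrative usage on synthetic data: the l1 constraint keeps w sparse.
rng = np.random.default_rng(0)
w = rank_subgradient(rng.normal(size=(20, 50)) + 0.5,
                     rng.normal(size=(20, 50)), z=2.0)
print(np.count_nonzero(w), "nonzero coefficients of", w.size)

The projection step is what enforces sparsity here; the paper's contribution is a primal-dual scheme that reaches an $\epsilon$-accurate solution of the same constrained problem in $O({1\over \epsilon})$ rather than $O({1\over \epsilon^2})$ iterations.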
Prediction algorithms, Optimization, Machine learning algorithms, Vectors, Computational modeling, Support vector machines, Accuracy, Fenchel duality, Learning-to-rank, sparse models, ranking algorithm
Hanjiang Lai, Yan Pan, Cong Liu, Liang Lin, Jie Wu, "Sparse Learning-to-Rank via an Efficient Primal-Dual Algorithm," IEEE Transactions on Computers, vol. 62, no. 6, pp. 1221-1233, June 2013, doi:10.1109/TC.2012.62