Comparing Scores Intended for Ranking
January 2009 (vol. 21 no. 1)
pp. 21-34
Narayan L. Bhamidipati, Indian Statistical Institute, Kolkata
Sankar K. Pal, Indian Statistical Institute, Kolkata
Ranking is often performed on the basis of scores available for each item. The existing practice for comparing scoring functions is to compare the induced rankings using one of the many rank comparison methods available in the literature. We suggest that it may be better to compare the underlying scores themselves. To this end, a generalized Kendall distance is defined, which takes into consideration not only the final ordering that the two schemes produce but also the spacing between pairs of scores. This is shown to be equivalent to comparing the scores after fusing with another set of scores, making it theoretically interesting. A top-k version of the score comparison methodology is also provided. Experimental results clearly show the advantages score comparison has over rank comparison.
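To make the contrast between rank comparison and score comparison concrete, the following is a minimal sketch in Python. The first function is the ordinary Kendall distance on the rankings induced by two score lists (a count of oppositely ordered pairs). The second, `gap_weighted_distance`, is a hypothetical illustration only: it is not the generalized Kendall distance defined in the paper, whose exact form is not given in this abstract. It simply weights each discordant pair by the size of the score gaps, which is one assumed way of letting spacing matter.

```python
from itertools import combinations


def kendall_distance(a, b):
    """Count item pairs that score lists a and b order oppositely."""
    return sum(
        1
        for i, j in combinations(range(len(a)), 2)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0
    )


def gap_weighted_distance(a, b):
    """Hypothetical score-aware variant (an assumption, not the paper's
    definition): each discordant pair contributes the sum of the
    absolute score gaps under the two scoring schemes."""
    return sum(
        abs(a[i] - a[j]) + abs(b[i] - b[j])
        for i, j in combinations(range(len(a)), 2)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0
    )


# Illustrative scores for three items (made-up values).
a = [0.9, 0.5, 0.1]
b_near = [0.50, 0.51, 0.10]  # items 0 and 1 swapped by a tiny margin
b_far = [0.20, 0.90, 0.10]   # items 0 and 1 swapped by a wide margin

rank_near = kendall_distance(a, b_near)
rank_far = kendall_distance(a, b_far)
score_near = gap_weighted_distance(a, b_near)
score_far = gap_weighted_distance(a, b_far)
```

In this made-up example both alternative scorings disagree with `a` on exactly one pair, so the rank-based distance cannot tell them apart, while the gap-weighted value is larger for `b_far`, where the swapped items are separated by a wide margin.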

Index Terms:
Information Technology and Systems, Correlation and regression analysis
Citation:
Narayan L. Bhamidipati, Sankar K. Pal, "Comparing Scores Intended for Ranking," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 1, pp. 21-34, Jan. 2009, doi:10.1109/TKDE.2008.111