Issue No.01 - January (2009 vol.21)
pp: 21-34
Narayan L. Bhamidipati, Indian Statistical Institute, Kolkata
Sankar K. Pal, Indian Statistical Institute, Kolkata
ABSTRACT
Ranking is often performed on the basis of scores available for each item. The existing practice for comparing scoring functions is to compare the induced rankings using one of the many rank comparison methods available in the literature. We suggest that it may be better to compare the underlying scores themselves. To this end, a generalized Kendall distance is defined, which takes into consideration not only the final ordering that the two schemes produce, but also the spacing between pairs of scores. This is shown to be equivalent to comparing the scores after fusing them with another set of scores, making it theoretically interesting. A top-k version of the score comparison methodology is also provided. Experimental results show the advantages score comparison has over rank comparison.
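As a rough illustration of the motivation above, the following sketch implements only the classical Kendall distance between the rankings induced by two score lists (the number of item pairs that the two lists order differently). It is not the generalized distance defined in the paper, which the abstract does not specify; the function name and the sample score lists are assumptions made for this example.

```python
from itertools import combinations


def kendall_distance(scores_a, scores_b):
    """Count item pairs that the two score lists order in opposite ways.

    Classical (rank-based) Kendall distance; tied pairs are not counted
    as disagreements. Illustrative only, not the paper's generalized form.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both score lists must cover the same items")
    discordant = 0
    for i, j in combinations(range(len(scores_a)), 2):
        diff_a = scores_a[i] - scores_a[j]
        diff_b = scores_b[i] - scores_b[j]
        # Opposite signs mean the two schemes disagree on this pair.
        if diff_a * diff_b < 0:
            discordant += 1
    return discordant


# Hypothetical scores for three items under three scoring schemes.
evenly_spaced = [0.9, 0.6, 0.3]
nearly_tied = [0.9, 0.89, 0.1]    # same ordering, very different spacing
reversed_scores = [0.1, 0.5, 0.9]  # opposite ordering

print(kendall_distance(evenly_spaced, nearly_tied))
print(kendall_distance(evenly_spaced, reversed_scores))
```

The first two schemes induce the same ranking, so the rank-based distance between them is zero even though one scheme nearly ties its top two items. That loss of spacing information is what a score-level comparison is meant to capture.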
INDEX TERMS
Information Technology and Systems, Correlation and regression analysis
CITATION
Narayan L. Bhamidipati, Sankar K. Pal, "Comparing Scores Intended for Ranking", IEEE Transactions on Knowledge & Data Engineering, vol.21, no. 1, pp. 21-34, January 2009, doi:10.1109/TKDE.2008.111