Issue No.11 - November (2008 vol.30)
pp: 1877-1890
Yushi Jing, Georgia Institute of Technology, Atlanta, and Google, Mountain View
Shumeet Baluja, Google, Mountain View
ABSTRACT
Because of the relative ease in understanding and processing text, commercial image-search systems often rely on techniques that are largely indistinguishable from text-search. Recently, academic studies have demonstrated the effectiveness of employing image-based features to provide alternative or additional signals. However, it remains uncertain whether such techniques will generalize to a large number of popular web queries, and whether the potential improvement to search quality warrants the additional computational cost. In this work, we cast the image-ranking problem into the task of identifying "authority" nodes on an inferred visual similarity graph and propose VisualRank to analyze the visual link structures among images. The images found to be "authorities" are chosen as those that answer the image-queries well. To understand the performance of such an approach in a real system, we conducted a series of large-scale experiments based on the task of retrieving images for 2000 of the most popular product queries. Our experimental results show significant improvement, in terms of user satisfaction and relevancy, in comparison to the most recent Google Image Search results. Maintaining modest computational cost is vital to ensuring that this procedure can be used in practice; we describe the techniques required to make this system practical for large-scale deployment in commercial search engines.
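The idea of finding "authority" nodes on a visual similarity graph can be illustrated with a minimal sketch of PageRank-style power iteration. This is not the authors' implementation: the function name `visual_rank`, the damping value 0.85 (a common PageRank convention), the uniform teleport vector, and the toy similarity matrix are all assumptions made for illustration, and the pairwise similarities are taken as already computed.

```python
def visual_rank(similarity, damping=0.85, iterations=100, tol=1e-9):
    """Score images by power iteration over a similarity graph.

    similarity[i][j] is an assumed, precomputed non-negative visual
    similarity between image i and image j.
    """
    n = len(similarity)

    # Column-normalize so that each image distributes its score to the
    # others in proportion to how similar they are to it.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [similarity[i][j] / col_sums[j] if col_sums[j] > 0 else 1.0 / n
         for j in range(n)]
        for i in range(n)
    ]

    rank = [1.0 / n] * n  # start from a uniform distribution
    for _ in range(iterations):
        new_rank = [
            damping * sum(transition[i][j] * rank[j] for j in range(n))
            + (1.0 - damping) / n  # uniform teleport term (an assumption)
            for i in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:  # stop once the scores have stabilized
            break
    return rank


# Hypothetical similarities for four images; image 3 resembles none of
# the others closely.
similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
scores = visual_rank(similarity)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this toy graph the three mutually similar images reinforce one another and receive the highest scores, while the dissimilar image falls to the bottom of `ranking`. A large-scale system would need a sparse graph and an approximate similarity computation instead of the dense nested lists used here.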
INDEX TERMS
Image Processing and Computer Vision, Image/video retrieval
CITATION
Yushi Jing, Shumeet Baluja, "VisualRank: Applying PageRank to Large-Scale Image Search", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.30, no. 11, pp. 1877-1890, November 2008, doi:10.1109/TPAMI.2008.121