Issue No.01 - January (2010 vol.32)
pp: 2-11
Hervé Jegou , INRIA Grenoble, France
Cordelia Schmid , INRIA Grenoble, France
Hedi Harzallah , INRIA Grenoble, France
Jakob Verbeek , INRIA Grenoble, France
This paper introduces the contextual dissimilarity measure, which significantly improves the accuracy of bag-of-features-based image search. Our measure takes into account the local distribution of the vectors and iteratively estimates distance update terms in the spirit of Sinkhorn's scaling algorithm, thereby modifying the neighborhood structure. Experimental results show that our approach gives significantly better results than a standard distance and outperforms the state of the art in terms of accuracy on the Nistér-Stewénius and Lola data sets. This paper also evaluates the impact of a large number of parameters, including the number of descriptors, the clustering method, the visual vocabulary size, and the distance measure. The optimal parameter choice is shown to be quite context-dependent. In particular, using a large number of descriptors is beneficial only when using our dissimilarity measure. We have also evaluated two novel variants: multiple assignment and rank aggregation. They are shown to further improve accuracy at the cost of higher memory usage and lower efficiency.
Image search, image retrieval, distance regularization.
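The iterative distance update described in the abstract can be illustrated with a small sketch. The abstract does not state the update formula, so the version below is an assumption: each item's neighborhood distance is taken as the mean distance to its k nearest neighbors, and every pairwise distance is rescaled by the geometric mean of the neighborhood distances divided by the square root of the product of the two items' neighborhood distances. The names `neighborhood_distance` and `contextual_rescale` and the parameters `k` and `iterations` are hypothetical, chosen for illustration only.

```python
import math


def neighborhood_distance(dist, i, k):
    """Mean distance from item i to its k nearest neighbors, excluding i itself."""
    others = sorted(dist[i][j] for j in range(len(dist)) if j != i)
    return sum(others[:k]) / k


def contextual_rescale(dist, k=2, iterations=10):
    """Iteratively rescale a symmetric distance matrix (assumed update rule).

    Assumes distinct items (all off-diagonal distances > 0) and k <= n - 1.
    Returns a new matrix; the input is left unchanged.
    """
    n = len(dist)
    d = [row[:] for row in dist]
    for _ in range(iterations):
        # Neighborhood distance of every item under the current distances.
        r = [neighborhood_distance(d, i, k) for i in range(n)]
        # Geometric mean of the neighborhood distances.
        r_bar = math.exp(sum(math.log(x) for x in r) / n)
        # Per-item update term: above 1 for items in dense regions,
        # below 1 for isolated items.
        delta = [math.sqrt(r_bar / x) for x in r]
        for i in range(n):
            for j in range(n):
                d[i][j] *= delta[i] * delta[j]
    return d
```

As a usage example, for four points on a line at 0, 1, 2, and 10 with `k=1`, the isolated point starts with a neighborhood distance eight times that of the others. Calling `contextual_rescale` on the matrix of absolute differences pulls the neighborhood distances toward a common value, which is the sense in which the neighborhood structure is modified. The scaling is symmetric in the two items, so the matrix stays symmetric, loosely analogous to the row and column scaling in Sinkhorn's algorithm.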
Hervé Jegou, Cordelia Schmid, Hedi Harzallah, Jakob Verbeek, "Accurate Image Search Using the Contextual Dissimilarity Measure", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.32, no. 1, pp. 2-11, January 2010, doi:10.1109/TPAMI.2008.285
[1] D. Lowe, “Distinctive Image Features from Scale Invariant Keypoints,” Int'l J. Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[2] K. Mikolajczyk and C. Schmid, “Scale and Affine Invariant Interest Point Detectors,” Int'l J. Computer Vision, vol. 60, no. 1, pp. 63-86, 2004.
[3] D. Nistér and H. Stewénius, “Scalable Recognition with a Vocabulary Tree,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 2161-2168, 2006.
[4] J. Sivic and A. Zisserman, “Video Google: A Text Retrieval Approach to Object Matching in Videos,” Proc. IEEE Int'l Conf. Computer Vision, pp. 1470-1477, 2003.
[5] R. Sinkhorn, “A Relationship between Arbitrary Positive Matrices and Doubly Stochastic Matrices,” Annals of Math. Statistics, vol. 35, pp. 876-879, 1964.
[6] S. Chopra, R. Hadsell, and Y. LeCun, “Learning a Similarity Metric Discriminatively, with Application to Face Verification,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 539-546, 2005.
[7] A. Frome, Y. Singer, and J. Malik, “Image Retrieval and Classification Using Local Distance Functions,” Advances in Neural Information Processing Systems, pp. 417-424, MIT Press, 2007.
[8] J. Goldberger, S. Roweis, G. Hinton, and R. Salakhutdinov, “Neighbourhood Components Analysis,” Advances in Neural Information Processing Systems, pp. 513-520, MIT Press, 2005.
[9] K. Weinberger, J. Blitzer, and L. Saul, “Distance Metric Learning for Large Margin Nearest Neighbor Classification,” Advances in Neural Information Processing Systems, pp. 1473-1480, MIT Press, 2006.
[10] G. Salton and C. Buckley, “Term-Weighting Approaches in Automatic Text Retrieval,” Information Processing and Management, vol. 24, no. 5, pp. 513-523, 1988.
[11] H. Stewénius and D. Nistér, “Object Recognition Benchmark,” 2008.
[12] R. Fagin, R. Kumar, and D. Sivakumar, “Efficient Similarity Search and Classification via Rank Aggregation,” Proc. ACM SIGMOD, pp. 301-312, 2003.
[13] J. Zobel, A. Moffat, and K. Ramamohanarao, “Inverted Files versus Signature Files for Text Indexing,” ACM Trans. Database Systems, vol. 23, no. 4, pp. 453-490, 1998.
[14] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, “Object Retrieval with Large Vocabularies and Fast Spatial Matching,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2007.
[15] H. Jegou, H. Harzallah, and C. Schmid, “A Contextual Dissimilarity Measure for Accurate and Efficient Image Search,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2007.
[16] E. Hörster, R. Lienhart, and M. Slaney, “Image Retrieval on Large-Scale Image Databases,” Proc. ACM Int'l Conf. Image and Video Retrieval, pp. 17-24, 2007.
[17] C. Johnson, R. Masson, and M. Trosset, “On the Diagonal Scaling of Euclidean Distance Matrices to Doubly Stochastic Matrices,” Linear Algebra and Its Applications, vol. 397, no. 1, pp. 253-264, 2005.
[18] J. Tenenbaum, V. de Silva, and J. Langford, “A Global Geometric Framework for Nonlinear Dimensionality Reduction,” Science, vol. 290, no. 5500, pp. 2319-2323, 2000.
[19] S. Roweis and L. Saul, “Nonlinear Dimensionality Reduction by Locally Linear Embedding,” Science, vol. 290, no. 5500, pp. 2323-2326, 2000.
[20] G. Shakhnarovich, T. Darrell, and P. Indyk, Nearest-Neighbor Methods in Learning and Vision: Theory and Practice, chapter 3. MIT Press, Mar. 2006.