Issue No.12 - Dec. (2013 vol.35)
pp: 2916-2929
Yunchao Gong, Dept. of Comput. Sci., Univ. of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Svetlana Lazebnik, Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Albert Gordo, Comput. Vision Center, Univ. Autonoma de Barcelona, Barcelona, Spain
Florent Perronnin, Textual Visual Pattern Anal., Xerox Res. Centre Eur., Meylan, France
ABSTRACT
This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task. This algorithm, dubbed iterative quantization (ITQ), has connections to multiclass spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). The resulting binary codes significantly outperform several other state-of-the-art methods. We also show that further performance improvements can result from transforming the data with a nonlinear kernel mapping prior to PCA or CCA. Finally, we demonstrate an application of ITQ to learning binary attributes or "classemes" on the ImageNet data set.
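The alternating minimization described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `pca_project` and `itq`, the random orthogonal initialization, the iteration count, and the `seed` parameter are all assumptions made for illustration. The sketch only covers the unsupervised PCA variant, alternating between snapping the rotated data to the nearest hypercube vertex and solving an orthogonal Procrustes problem for the rotation.

```python
import numpy as np


def pca_project(X, n_bits):
    """Zero-center X and project it onto its top n_bits principal directions."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / Xc.shape[0]
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(cov)
    W = eigvecs[:, ::-1][:, :n_bits]
    return Xc @ W, W, mean


def itq(V, n_iter=50, seed=0):
    """Alternate between binary codes B and rotation R to reduce ||B - V R||_F^2.

    V is an (n, c) matrix of zero-centered, projected data.
    Returns the codes B in {-1, +1}, the rotation R, and the loss per iteration.
    """
    rng = np.random.default_rng(seed)
    c = V.shape[1]
    # Assumed initialization: a random orthogonal matrix obtained via QR.
    R, _ = np.linalg.qr(rng.standard_normal((c, c)))
    losses = []
    for _ in range(n_iter):
        # Fix R, update B: the nearest hypercube vertex is the sign of V R.
        VR = V @ R
        B = np.where(VR >= 0, 1.0, -1.0)
        losses.append(float(np.sum((B - VR) ** 2)))
        # Fix B, update R: orthogonal Procrustes solved with an SVD of V^T B.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    B = np.where(V @ R >= 0, 1.0, -1.0)
    return B, R, losses


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((500, 32))   # synthetic stand-in for image features
    V, W, mean = pca_project(X, n_bits=8)
    B, R, losses = itq(V, n_iter=20)
    print("quantization error, first vs. last iteration:", losses[0], losses[-1])
    # Encode a new point with the learned mean, projection, and rotation.
    query = rng.standard_normal(32)
    code = ((query - mean) @ W @ R) >= 0
    print("query code bits:", code.astype(int))
```

Because each half-step minimizes the same objective with the other variable held fixed, the recorded quantization error should not increase from one iteration to the next; it converges to a local rather than global minimum.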
INDEX TERMS
Quantization, Binary codes, Principal component analysis, Encoding, Linear programming, Iterative methods, quantization, Large-scale image search, binary codes, hashing
CITATION
Yunchao Gong, Svetlana Lazebnik, Albert Gordo, Florent Perronnin, "Iterative Quantization: A Procrustean Approach to Learning Binary Codes for Large-Scale Image Retrieval", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.35, no. 12, pp. 2916-2929, Dec. 2013, doi:10.1109/TPAMI.2012.193