Issue No.01 - January (2011 vol.33)
pp: 43-57
Matthew Brown, Ecole Polytechnique Fédérale de Lausanne, Lausanne
Gang Hua, Nokia Research Center Hollywood, Santa Monica
Simon Winder, Microsoft Research Redmond, Redmond
ABSTRACT
In this paper, we explore methods for learning local image descriptors from training data. We describe a set of building blocks for constructing descriptors which can be combined together and jointly optimized so as to minimize the error of a nearest-neighbor classifier. We consider both linear and nonlinear transforms with dimensionality reduction, and make use of discriminant learning techniques such as Linear Discriminant Analysis (LDA) and Powell minimization to solve for the parameters. Using these techniques, we obtain descriptors that exceed state-of-the-art performance with low dimensionality. In addition to new experiments and recommendations for descriptor learning, we are also making available a new and realistic ground truth data set based on multiview stereo data.
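As a rough illustration of the linear discriminant step mentioned in the abstract, the sketch below learns a low-dimensional projection from matching and non-matching descriptor pairs by maximizing the ratio of non-match scatter to match scatter. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `pair_lda` and `dist_ratio`, the regularization constant, and the synthetic 16-dimensional data are all hypothetical and exist only for this example.

```python
import numpy as np

def pair_lda(x1, x2, is_match, out_dim, reg=1e-3):
    """Hypothetical helper: learn a projection from labeled descriptor pairs.

    Maximizes non-match scatter relative to match scatter, in the spirit of LDA.
    """
    d = x1 - x2
    # Scatter of pair differences for matching (A) and non-matching (B) pairs.
    A = d[is_match].T @ d[is_match] / is_match.sum()
    B = d[~is_match].T @ d[~is_match] / (~is_match).sum()
    # Assumed regularization so that A is safely invertible.
    A = A + reg * np.trace(A) / A.shape[0] * np.eye(A.shape[0])
    # Whiten A: with A = U diag(s) U^T, W = U diag(s^-1/2) gives W^T A W = I.
    s, U = np.linalg.eigh(A)
    W = U / np.sqrt(s)
    # Eigenvectors of the whitened B with the largest eigenvalues are the
    # directions where non-matches spread most relative to matches.
    vals, V = np.linalg.eigh(W.T @ B @ W)
    top = np.argsort(vals)[::-1][:out_dim]
    return W @ V[:, top]  # shape (in_dim, out_dim)

def dist_ratio(a, b, is_match):
    """Mean match distance divided by mean non-match distance (lower is better)."""
    dist = np.linalg.norm(a - b, axis=1)
    return dist[is_match].mean() / dist[~is_match].mean()

# Synthetic stand-in for descriptors: 4 informative dims, 12 nuisance dims.
rng = np.random.default_rng(0)
n, sig, nui = 2000, 4, 12
is_match = np.arange(n) < n // 2
z1 = rng.normal(size=(n, sig))
z2 = np.where(is_match[:, None], z1, rng.normal(size=(n, sig)))

def render(z):
    # Small noise on the informative part, large independent nuisance noise.
    return np.hstack([z + 0.1 * rng.normal(size=z.shape),
                      3.0 * rng.normal(size=(len(z), nui))])

x1, x2 = render(z1), render(z2)

# Train on even-indexed pairs, evaluate on odd-indexed pairs.
train = np.arange(n) % 2 == 0
test = ~train
P = pair_lda(x1[train], x2[train], is_match[train], out_dim=sig)

raw_ratio = dist_ratio(x1[test], x2[test], is_match[test])
proj_ratio = dist_ratio(x1[test] @ P, x2[test] @ P, is_match[test])
print(f"raw ratio: {raw_ratio:.3f}, projected ratio: {proj_ratio:.3f}")
```

On this toy data the projected descriptors should separate matches from non-matches far better than the raw ones, while using only 4 of the 16 dimensions. Parameters that are not linear, which the abstract says are fitted with Powell minimization, are outside the scope of this sketch.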
INDEX TERMS
Image descriptors, local features, discriminative learning, SIFT.
CITATION
Matthew Brown, Gang Hua, Simon Winder, "Discriminative Learning of Local Image Descriptors", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.33, no. 1, pp. 43-57, January 2011, doi:10.1109/TPAMI.2010.54