Issue No.12 - December (2009 vol.31)
pp: 2129-2142
Christoph H. Lampert, Max Planck Institute for Biological Cybernetics, Tuebingen
Matthew B. Blaschko, University of Oxford, Oxford
Thomas Hofmann, Google Inc., Zurich
ABSTRACT
Most successful object recognition systems rely on binary classification, deciding only if an object is present or not, but not providing information on the actual object location. To estimate the object's location, one can take a sliding window approach, but this strongly increases the computational cost because the classifier or similarity function has to be evaluated over a large set of candidate subwindows. In this paper, we propose a simple yet powerful branch and bound scheme that allows efficient maximization of a large class of quality functions over all possible subimages. It converges to a globally optimal solution typically in linear or even sublinear time, in contrast to the quadratic scaling of exhaustive or sliding window search. We show how our method is applicable to different object detection and image retrieval scenarios. The achieved speedup allows the use of classifiers for localization that formerly were considered too slow for this task, such as SVMs with a spatial pyramid kernel or nearest-neighbor classifiers based on the χ² distance. We demonstrate state-of-the-art localization performance of the resulting systems on the UIUC Cars data set, the PASCAL VOC 2006 data set, and in the PASCAL VOC 2007 competition.
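The branch and bound idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simplest kind of quality function, a sum of per-pixel weights inside the window (as a linear classifier over local features would give), and the names `integral`, `box_sum`, and `ess` are hypothetical. A search state is a set of rectangles given by intervals for the top, bottom, left, and right coordinates. Its upper bound adds the positive weights inside the largest rectangle of the set to the negative weights inside the smallest one, and the state with the highest bound is split first.

```python
import heapq
import numpy as np

def integral(a):
    # Zero-padded 2D cumulative sum, so any box sum costs four lookups.
    return np.pad(a, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)

def box_sum(ii, t, b, l, r):
    # Sum over rows t..b and columns l..r (inclusive); empty boxes give 0.
    if t > b or l > r:
        return 0.0
    return float(ii[b + 1, r + 1] - ii[t, r + 1] - ii[b + 1, l] + ii[t, l])

def ess(weights):
    """Return (score, (top, bottom, left, right)) of a maximum-sum window.

    Assumption: the quality of a window is the sum of `weights` inside it.
    """
    h, w = weights.shape
    pos = integral(np.maximum(weights, 0.0))  # positive part of the weights
    neg = integral(np.minimum(weights, 0.0))  # negative part of the weights

    def bound(s):
        (t0, t1), (b0, b1), (l0, l1), (r0, r1) = s
        # Every rectangle in the set lies inside the largest one
        # (t0..b1, l0..r1) and contains the smallest one (t1..b0, l1..r0).
        return box_sum(pos, t0, b1, l0, r1) + box_sum(neg, t1, b0, l1, r0)

    root = ((0, h - 1), (0, h - 1), (0, w - 1), (0, w - 1))
    heap = [(-bound(root), root)]  # heapq is a min-heap, so negate the bound
    while heap:
        neg_bound, s = heapq.heappop(heap)
        widths = [hi - lo for lo, hi in s]
        k = max(range(4), key=widths.__getitem__)
        if widths[k] == 0:
            # A single rectangle: its bound is exact, and no other state
            # has a higher bound, so it is a global maximum.
            t, b, l, r = (lo for lo, _ in s)
            return -neg_bound, (t, b, l, r)
        lo, hi = s[k]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):  # split the widest interval
            child = s[:k] + (part,) + s[k + 1:]
            # Skip sets that contain no valid rectangle at all.
            if child[0][0] > child[1][1] or child[2][0] > child[3][1]:
                continue
            heapq.heappush(heap, (-bound(child), child))
```

For a 6 x 7 weight map that is -1 everywhere except for a block of +2 in rows 2 to 3 and columns 1 to 3, `ess` returns the score 12.0 and the window `(2, 3, 1, 3)`. The integral images are what keep each bound evaluation at constant cost, which is the property the search relies on.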
INDEX TERMS
Object localization, sliding window, global optimization, branch and bound.
CITATION
Christoph H. Lampert, Matthew B. Blaschko, Thomas Hofmann, "Efficient Subwindow Search: A Branch and Bound Framework for Object Localization", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 12, pp. 2129-2142, December 2009, doi:10.1109/TPAMI.2009.144