A Coarse-to-Fine Strategy for Multiclass Shape Detection
December 2004 (vol. 26 no. 12)
pp. 1606-1621
Multiclass shape detection, in the sense of recognizing and localizing instances from multiple shape classes, is formulated as a two-step process in which local indexing primes global interpretation. During indexing a list of instantiations (shape identities and poses) is compiled, constrained only by no missed detections at the expense of false positives. Global information, such as expected relationships among poses, is incorporated afterward to remove ambiguities. This division is motivated by computational efficiency. In addition, indexing itself is organized as a coarse-to-fine search simultaneously in class and pose. This search can be interpreted as successive approximations to likelihood ratio tests arising from a simple (“naive Bayes”) statistical model for the edge maps extracted from the original images. The key to constructing efficient “hypothesis tests” for multiple classes and poses is local ORing; in particular, spread edges provide imprecise but common and locally invariant features. Natural tradeoffs then emerge between discrimination and the pattern of spreading. These are analyzed mathematically within the model-based framework, and the whole procedure is illustrated by experiments in reading license plates.
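
The local ORing mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary edge map and a square spreading window; the function name `spread_edges`, the `radius` parameter, and the window shape are hypothetical choices made for illustration and are not taken from the paper, which studies the pattern of spreading in more detail.

```python
import numpy as np


def spread_edges(edges: np.ndarray, radius: int) -> np.ndarray:
    """Local ORing sketch: a pixel is on if any edge lies within a
    (2*radius + 1) x (2*radius + 1) window centered on it.

    The square window is an assumption made for this illustration.
    """
    edges = edges.astype(bool)
    h, w = edges.shape
    # Pad with False so windows near the border see no spurious edges.
    padded = np.pad(edges, radius, mode="constant", constant_values=False)
    out = np.zeros_like(edges)
    # OR together every shifted copy of the edge map inside the window.
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


# A single edge pixel, and the same edge shifted by one pixel.
edges = np.zeros((5, 5), dtype=bool)
edges[2, 2] = True
shifted = np.zeros((5, 5), dtype=bool)
shifted[2, 3] = True

spread = spread_edges(edges, radius=1)
spread_shifted = spread_edges(shifted, radius=1)
```

In this toy case the spread feature at location (2, 2) stays on after the edge moves by one pixel, which is the local invariance the abstract refers to. The same spreading also makes the feature fire more often, so it discriminates less, which is the tradeoff the abstract describes.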

Index Terms:
Shape detection, multiple classes, statistical model, spread edges, coarse-to-fine search, online competition.
Yali Amit, Donald Geman, Xiaodong Fan, "A Coarse-to-Fine Strategy for Multiclass Shape Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 12, pp. 1606-1621, Dec. 2004, doi:10.1109/TPAMI.2004.111