Issue No.08 - August (2009 vol.31)
pp: 1486-1501
Iasonas Kokkinos, University of California at Los Angeles, Los Angeles
Petros Maragos, National Technical University of Athens, Athens
ABSTRACT
In this work, we formulate the interaction between image segmentation and object recognition in the framework of the Expectation-Maximization (EM) algorithm. We consider segmentation as the assignment of image observations to object hypotheses and phrase it as the E-step, while the M-step amounts to fitting the object models to the observations. These two tasks are performed iteratively, thereby simultaneously segmenting an image and reconstructing it in terms of objects. We model objects using Active Appearance Models (AAMs) as they capture both shape and appearance variation. During the E-step, the fidelity of the AAM predictions to the image is used to decide about assigning observations to the object. For this, we propose two top-down segmentation algorithms. The first starts with an oversegmentation of the image and then softly assigns image segments to objects, as in the common setting of EM. The second uses curve evolution to minimize a criterion derived from the variational interpretation of EM and introduces AAMs as shape priors. For the M-step, we derive AAM fitting equations that accommodate segmentation information, thereby allowing for the automated treatment of occlusions. Apart from top-down segmentation results, we provide systematic experiments on object detection that validate the merits of our joint segmentation and recognition approach.
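The alternation described in the abstract can be illustrated with a deliberately small sketch. It is not the authors' implementation: the Active Appearance Model is replaced by a hypothetical one-coefficient linear appearance model (`mean + c * basis`), the background is assumed to be uniform on [0, 1], and all names (`em_fit`, `make_toy_data`, `sigma`, `prior`, `bg_density`) are assumptions made for illustration. The E-step softly assigns each observation to the object or the background according to how well the model predicts it; the M-step refits the coefficient by least squares weighted by those assignments, so that poorly explained (for example, occluded) observations have little influence on the fit.

```python
import math


def gaussian(x, mu, sigma):
    """Density of a 1D normal distribution N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_fit(image, mean, basis, sigma=0.1, prior=0.5, bg_density=1.0, n_iter=20):
    """Alternate soft assignment (E-step) and weighted model fitting (M-step).

    The object predicts intensity mean[i] + c * basis[i] at pixel i; this
    toy linear model is an assumed stand-in for an AAM. The background is
    assumed uniform with density bg_density.
    Returns the fitted coefficient c and the per-pixel responsibilities.
    """
    c = 0.0
    resp = [prior] * len(image)
    for _ in range(n_iter):
        # E-step: posterior probability that each pixel belongs to the object.
        resp = []
        for obs, m, b in zip(image, mean, basis):
            p_obj = prior * gaussian(obs, m + c * b, sigma)
            p_bg = (1.0 - prior) * bg_density
            resp.append(p_obj / (p_obj + p_bg))
        # M-step: least-squares fit of c, weighted by the responsibilities.
        num = sum(r * (obs - m) * b
                  for r, obs, m, b in zip(resp, image, mean, basis))
        den = sum(r * b * b for r, b in zip(resp, basis))
        if den > 0.0:
            c = num / den
    return c, resp


def make_toy_data(true_c=0.2, n=10, occluded=(7, 8, 9)):
    """Build a noise-free 1D 'image' and overwrite some pixels with an occluder."""
    mean = [0.5] * n
    half = (n - 1) / 2.0
    basis = [(i - half) / half for i in range(n)]
    image = [m + true_c * b for m, b in zip(mean, basis)]
    for i in occluded:
        image[i] = 0.0  # dark occluder that the object model cannot explain
    return image, mean, basis


if __name__ == "__main__":
    image, mean, basis = make_toy_data()
    c, resp = em_fit(image, mean, basis)
    print("fitted coefficient:", c)
    print("responsibilities:", [round(r, 3) for r in resp])
```

On this toy input the occluded pixels end up with responsibilities close to zero, so the weighted fit is driven almost entirely by the visible pixels. An unweighted least-squares fit over all pixels would instead be pulled toward the occluder.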
INDEX TERMS
Image segmentation, object recognition, Expectation Maximization, Active Appearance Models, curve evolution, top-down segmentation, generative models.
CITATION
Iasonas Kokkinos, Petros Maragos, "Synergy between Object Recognition and Image Segmentation Using the Expectation-Maximization Algorithm", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 8, pp. 1486-1501, August 2009, doi:10.1109/TPAMI.2008.158