A Probabilistic Framework for 3D Visual Object Representation
October 2009 (vol. 31 no. 10)
pp. 1790-1803
Renaud Detry, University of Liège, Belgium
Nicolas Pugeault, University of Southern Denmark, Denmark
Justus H. Piater, University of Liège, Belgium
We present an object representation framework that encodes probabilistic spatial relations between 3D features and organizes these features in a hierarchy. Features at the bottom of the hierarchy are bound to local 3D descriptors. Higher level features recursively encode probabilistic spatial configurations of more elementary features. The hierarchy is implemented in a Markov network. Detection is carried out by a belief propagation algorithm, which infers the pose of high-level features from local evidence and reinforces local evidence from globally consistent knowledge, effectively producing a likelihood for the pose of the object in the detection scene. We also present a simple learning algorithm that autonomously builds hierarchies from local object descriptors. We explain how to use our framework to estimate the pose of a known object in an unknown scene. Experiments demonstrate the robustness of hierarchies to input noise, viewpoint changes, and occlusions.
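
The two-way flow of information described in the abstract (local evidence informing high-level features, and high-level features reinforcing local evidence) can be illustrated with a small sketch. The example below is not the authors' implementation: it assumes a toy two-level hierarchy with one parent feature and two child features, poses reduced to discrete 1D positions, and Gaussian-shaped pairwise potentials, and it runs ordinary discrete sum-product belief propagation rather than the nonparametric variant named in the index terms. All function names, offsets, and evidence values are hypothetical.

```python
import numpy as np

def pairwise_potential(n, offset, sigma):
    """Compatibility psi[parent, child]: high when child is near parent + offset."""
    parent = np.arange(n)[:, None]
    child = np.arange(n)[None, :]
    return np.exp(-0.5 * ((child - parent - offset) / sigma) ** 2)

def normalize(v):
    return v / v.sum()

def propagate(evidence, potentials):
    """One upward and one downward pass on a parent with several children."""
    # Upward pass: each child sends the parent a message summarizing its evidence.
    up = [normalize(psi @ phi) for psi, phi in zip(potentials, evidence)]
    # The parent belief is the normalized product of all incoming messages.
    parent_belief = normalize(np.prod(up, axis=0))
    # Downward pass: each child receives what the *other* children implied.
    child_beliefs = []
    for i, (psi, phi) in enumerate(zip(potentials, evidence)):
        others = np.ones_like(parent_belief)
        for j, message in enumerate(up):
            if j != i:
                others = others * message
        down = psi.T @ others
        child_beliefs.append(normalize(phi * down))
    return parent_belief, child_beliefs

n = 20  # number of discrete positions (assumed)
# Hypothetical model: child 0 sits 3 cells before the parent, child 1 sits 4 after.
potentials = [pairwise_potential(n, -3, 1.0), pairwise_potential(n, 4, 1.0)]

# Hypothetical observations: weak uniform background plus a few strong detections.
evidence = [np.full(n, 0.05), np.full(n, 0.05)]
evidence[0][7] = 1.0    # child 0 observed at 7, consistent with a parent at 10
evidence[1][14] = 1.0   # child 1 observed at 14, consistent with a parent at 10
evidence[1][2] = 1.0    # spurious detection of child 1 (clutter)

parent_belief, child_beliefs = propagate(evidence, potentials)
print("parent position estimate:", int(np.argmax(parent_belief)))
print("child 1 position estimate:", int(np.argmax(child_beliefs[1])))
```

In this toy setting, the spurious detection of child 1 at position 2 has the same local evidence as the true detection at position 14, but the downward message from the parent favors the position that agrees with child 0. This is the sense in which globally consistent knowledge reinforces local evidence.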

Index Terms:
Computer vision, 3D object representation, pose estimation, nonparametric belief propagation.
Citation:
Renaud Detry, Nicolas Pugeault, Justus H. Piater, "A Probabilistic Framework for 3D Visual Object Representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 10, pp. 1790-1803, Oct. 2009, doi:10.1109/TPAMI.2009.64