A Probabilistic Framework for 3D Visual Object Representation
October 2009 (vol. 31 no. 10)
pp. 1790-1803
We present an object representation framework that encodes probabilistic spatial relations between 3D features and organizes these features in a hierarchy. Features at the bottom of the hierarchy are bound to local 3D descriptors. Higher level features recursively encode probabilistic spatial configurations of more elementary features. The hierarchy is implemented in a Markov network. Detection is carried out by a belief propagation algorithm, which infers the pose of high-level features from local evidence and reinforces local evidence from globally consistent knowledge, effectively producing a likelihood for the pose of the object in the detection scene. We also present a simple learning algorithm that autonomously builds hierarchies from local object descriptors. We explain how to use our framework to estimate the pose of a known object in an unknown scene. Experiments demonstrate the robustness of hierarchies to input noise, viewpoint changes, and occlusions.
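The two-way flow of evidence described above can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the paper's index terms name nonparametric belief propagation over 3D poses, whereas this toy swaps in ordinary discrete sum-product belief propagation on a 1D grid of positions. One parent feature has two child features, each expected at a fixed offset from the parent. All names (`compatibility`, `upward`, `downward`, `run_bp`), the offsets, the Gaussian compatibility, and the evidence values are assumptions chosen for illustration.

```python
import math

def gaussian(x, sigma):
    return math.exp(-0.5 * (x / sigma) ** 2)

def normalize(v):
    total = sum(v)
    return [x / total for x in v]

def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])

def compatibility(offset, n, sigma=1.0):
    # psi[p][c]: agreement between parent position p and child position c,
    # assuming the child is expected at p + offset (hypothetical model).
    return [[gaussian(c - (p + offset), sigma) for c in range(n)]
            for p in range(n)]

def upward(psi, evidence):
    # Message child -> parent: local evidence projected onto parent positions.
    n = len(evidence)
    return normalize([sum(psi[p][c] * evidence[c] for c in range(n))
                      for p in range(n)])

def downward(psi, parent_msg):
    # Message parent -> child: what the rest of the model predicts for the child.
    n = len(parent_msg)
    return normalize([sum(psi[p][c] * parent_msg[p] for p in range(n))
                      for c in range(n)])

def run_bp(offsets, evidences, sigma=1.0):
    n = len(evidences[0])
    psis = [compatibility(o, n, sigma) for o in offsets]
    ups = [upward(psi, ev) for psi, ev in zip(psis, evidences)]
    # Parent belief: product of all upward messages.
    parent_belief = normalize([math.prod(u[p] for u in ups) for p in range(n)])
    child_beliefs = []
    for i, (psi, ev) in enumerate(zip(psis, evidences)):
        # Exclude the child's own upward message, as sum-product requires.
        others = [math.prod(u[p] for j, u in enumerate(ups) if j != i)
                  for p in range(n)]
        down = downward(psi, others)
        child_beliefs.append(normalize([e * d for e, d in zip(ev, down)]))
    return parent_belief, child_beliefs

# Toy scene: child A is observed cleanly at position 7; child B has two
# equally strong candidate observations, at 5 and at 14.
n, floor = 20, 0.01
ev_a = [floor] * n
ev_a[7] = 1.0
ev_b = [floor] * n
ev_b[5] = 1.0
ev_b[14] = 1.0
parent_belief, child_beliefs = run_bp([-3, 4], [ev_a, ev_b])
print(argmax(parent_belief), [argmax(b) for b in child_beliefs])
```

In this toy scene the parent belief is driven by both children, and the downward message lets the unambiguous child A settle child B's two-way ambiguity. That is the sense in which local evidence is reinforced by globally consistent knowledge. A real system would replace the grid with continuous 3D poses and the single parent with a multi-level hierarchy.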
Index Terms:
Computer vision, 3D object representation, pose estimation, nonparametric belief propagation.
Citation:
Renaud Detry, Nicolas Pugeault, Justus H. Piater, "A Probabilistic Framework for 3D Visual Object Representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 10, pp. 1790-1803, Mar. 2009, doi:10.1109/TPAMI.2009.64