Issue No.03 - March (2014 vol.36)
pp: 564-576
Simon Hadfield , Centre for Vision, Speech & Signal Process., Univ. of Surrey, Guildford, UK
Richard Bowden , Centre for Vision, Speech & Signal Process., Univ. of Surrey, Guildford, UK
ABSTRACT
This paper presents an algorithm for estimating scene flow, a richer, 3D analog of optical flow. The approach operates orders of magnitude faster than alternative techniques and is well suited to further performance gains through parallelized implementation. Rather than relying on traditional smoothness constraints, the algorithm employs multiple hypotheses to deal with motion ambiguities, which removes oversmoothing errors and provides significant performance improvements over the previous state of the art on benchmark data. The approach is flexible and can operate with any combination of appearance and/or depth sensors, in any setup, simultaneously estimating structure and motion if necessary. Additionally, the algorithm propagates information over time to resolve ambiguities, rather than performing an isolated estimation at each frame as contemporary approaches do. Approaches to smoothing the motion field without sacrificing the benefits of multiple hypotheses are explored, and a probabilistic approach to occlusion estimation is demonstrated, improving performance by 10 and 15 percent, respectively. Finally, a data-driven tracking approach is described and used to estimate the 3D trajectories of hands during sign language, without the need to model complex appearance variations at each viewpoint.
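The idea of keeping multiple motion hypotheses and propagating them over time can be illustrated with a generic sample-weight-resample loop. The sketch below is not the authors' algorithm: it is a minimal bootstrap particle filter for the 3D motion of a single point, assuming the point's previous position is known and replacing the appearance and depth terms with a toy Gaussian observation model. All names and parameters (`step`, `demo`, `diffusion`, `obs_sigma`) are hypothetical.

```python
import math
import random


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to their weights."""
    return rng.choices(particles, weights=weights, k=len(particles))


def step(particles, prev_pos, observed_pos, rng, diffusion=0.05, obs_sigma=0.1):
    """One sample-weight-resample update over 3D motion hypotheses.

    Each particle is a candidate motion (dx, dy, dz) for one scene point.
    The Gaussian likelihood is an assumed stand-in for real sensor terms.
    """
    # Diffuse: perturb every hypothesis so the set keeps exploring.
    moved = [tuple(v + rng.gauss(0.0, diffusion) for v in p) for p in particles]
    # Weight: score each hypothesis by how well it predicts the observation.
    weights = []
    for p in moved:
        predicted = tuple(a + b for a, b in zip(prev_pos, p))
        err2 = sum((a - b) ** 2 for a, b in zip(predicted, observed_pos))
        weights.append(math.exp(-err2 / (2.0 * obs_sigma ** 2)))
    if sum(weights) == 0.0:
        # Fall back to uniform weights if every likelihood underflowed.
        weights = [1.0] * len(moved)
    # Resample: the surviving hypotheses are carried into the next frame.
    return resample(moved, weights, rng)


def mean_motion(particles):
    """Average the hypotheses into a single motion estimate."""
    n = len(particles)
    return tuple(sum(p[i] for p in particles) / n for i in range(3))


def demo(seed=0, n_particles=2000, n_frames=10):
    """Track a point moving with constant (assumed) motion for a few frames."""
    rng = random.Random(seed)
    true_motion = (0.3, -0.2, 0.1)
    particles = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
                 for _ in range(n_particles)]
    pos = (0.0, 0.0, 0.0)
    for _ in range(n_frames):
        new_pos = tuple(a + b for a, b in zip(pos, true_motion))
        observed = tuple(c + rng.gauss(0.0, 0.02) for c in new_pos)
        particles = step(particles, pos, observed, rng)
        pos = new_pos
    return mean_motion(particles), true_motion


if __name__ == "__main__":
    estimate, truth = demo()
    print("estimated motion:", estimate)
    print("true motion:     ", truth)
```

Because the hypothesis set is carried from frame to frame, ambiguity at one frame can be resolved by later observations without a smoothness term. Each point's update is also independent of the others, which is the property that makes this style of estimator a natural fit for parallel implementation.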
INDEX TERMS
Estimation, Smoothing methods, Equations, Sociology, Statistics, Optical sensors, motion segmentation, Scene flow, scene particles, motion estimation, 3D, 3D motion, particle, particle filter, optical flow, hand tracking, sign language, tracking, occlusion estimation, probabilistic occlusion, occlusion, bilateral filter, 3D tracking
CITATION
Simon Hadfield, Richard Bowden, "Scene Particles: Unregularized Particle-Based Scene Flow Estimation", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 36, no. 3, pp. 564-576, March 2014, doi:10.1109/TPAMI.2013.162