Spatiotemporal Stereo and Scene Flow via Stequel Matching
June 2012 (vol. 34 no. 6)
pp. 1206-1219
M. Sizintsev, Dept. of Comput. Sci. & Eng., York Univ., Toronto, ON, Canada
R. P. Wildes, Dept. of Comput. Sci. & Eng., York Univ., Toronto, ON, Canada
This paper is concerned with recovering temporally coherent estimates of the 3D structure and motion of a dynamic scene from a sequence of binocular stereo images. A novel approach is presented based on matching spatiotemporal quadric elements (stequels) between views, as this primitive encapsulates both spatial and temporal image structure for 3D estimation. Match constraints are developed for bringing stequels into correspondence across binocular views. With correspondence established, temporally coherent disparity estimates are obtained without explicit motion recovery. Further, the matched stequels are shown to support direct recovery of scene flow estimates. Extensive evaluation against ground truth data, with stequels incorporated in both local and global correspondence paradigms, shows the considerable benefit of using stequels as a matching primitive and their advantages over alternative methods of enforcing temporal coherence in disparity estimation. Additional experiments document the usefulness of stequel matching for 3D scene flow estimation.
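
To make the idea of a spatiotemporal quadric concrete, the following is a minimal sketch that assumes the quadric can be approximated by the standard 3x3 gradient structure tensor of a small (t, y, x) image volume. This is a swapped-in textbook construction, not necessarily how the paper builds its stequels, and the names `spacetime_quadric` and `dominant_velocity` are hypothetical. For a pattern translating with image velocity (u, v), every spacetime gradient is orthogonal to (u, v, 1), so the eigenvector with the smallest eigenvalue encodes the motion alongside the spatial structure.

```python
import numpy as np

def spacetime_quadric(volume):
    """3x3 gradient structure tensor of a (t, y, x) volume (illustrative stand-in)."""
    # np.gradient returns one derivative per axis, in axis order: t, y, x.
    gt, gy, gx = np.gradient(np.asarray(volume, dtype=float))
    # Drop the border samples, where np.gradient falls back to one-sided differences.
    inner = (slice(1, -1),) * 3
    g = np.stack([gx[inner].ravel(), gy[inner].ravel(), gt[inner].ravel()])
    # Average outer product of the gradients, ordered (x, y, t).
    return g @ g.T / g.shape[1]

def dominant_velocity(quadric):
    """Image velocity (u, v) read off the eigenvector with the smallest eigenvalue."""
    _, vecs = np.linalg.eigh(quadric)  # eigenvalues are returned in ascending order
    ex, ey, et = vecs[:, 0]
    # Assumes a nonzero temporal component and enough 2D texture to avoid
    # the aperture problem.
    return ex / et, ey / et

# Synthetic pattern translating with velocity (1.0, 0.5) pixels per frame.
t, y, x = np.meshgrid(np.arange(9), np.arange(32), np.arange(32), indexing="ij")
u_true, v_true = 1.0, 0.5
volume = np.sin(0.3 * (x - u_true * t)) + np.cos(0.2 * (y - v_true * t))
u_est, v_est = dominant_velocity(spacetime_quadric(volume))
```

On this synthetic volume the recovered `(u_est, v_est)` should lie close to the true velocity, up to the discretization error of the finite differences. In a stereo setting, one such quadric would be computed per pixel neighbourhood in each view before any matching takes place.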

Index Terms:
stereo image processing, image matching, image motion analysis, image sequences, 3D scene flow estimation, spatiotemporal stereo, stequel matching, 3D structure, dynamic scene motion, binocular stereo image sequence, spatiotemporal quadric elements, image structure, scene flow estimates, ground truth data, Spatiotemporal phenomena, Three dimensional displays, Estimation, Optical imaging, Cameras, Stereo vision, Pattern analysis, stequel, Stereo, motion, spacetime, spatiotemporal, scene flow, quadric element
Citation:
M. Sizintsev, R. P. Wildes, "Spatiotemporal Stereo and Scene Flow via Stequel Matching," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 6, pp. 1206-1219, June 2012, doi:10.1109/TPAMI.2011.202