Inferring Segmented Dense Motion Layers Using 5D Tensor Voting
September 2008 (vol. 30 no. 9)
pp. 1589-1602
We present a novel local spatiotemporal approach that produces motion segmentation and dense temporal trajectories from an image sequence. A common representation of image sequences is a 3D spatiotemporal volume, (x,y,t), whose corresponding mathematical formalism is the fiber bundle. However, directly enforcing the spatiotemporal smoothness constraint is difficult in the fiber bundle representation. We therefore convert the representation into a new 5D space (x,y,t,vx,vy) with an additional velocity domain, in which each moving object produces a separate smooth 3D layer. The smoothness constraint is then enforced by extracting these 3D layers with the tensor voting framework in a single step that solves both correspondence and segmentation simultaneously. Motion segmentation is achieved by identifying the layers, and the dense temporal trajectories are obtained by converting the layers back into the fiber bundle representation. We proceed to address three applications (tracking, mosaicking, and 3D reconstruction) that are hard to solve directly from the video stream because they require segmentation and dense matching, but become straightforward with our framework. The approach makes no restrictive assumptions about the observed scene or camera motion and is therefore generally applicable. We present results on a number of data sets.
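The core representational step described above (lifting each pixel of each frame, together with a candidate velocity, into the 5D space (x,y,t,vx,vy)) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes per-pixel velocity fields are already available as hypothetical `flows` arrays, and the `v_scale` parameter is an assumed knob for balancing the spatial and velocity axes.

```python
import numpy as np

def to_5d_points(flows, v_scale=1.0):
    """Lift a sequence of dense velocity fields into (x, y, t, vx, vy) space.

    flows: list of (H, W, 2) arrays, where flows[t][y, x] = (vx, vy) at frame t.
    Returns an (N, 5) array of 5D points. In this space, each coherently
    moving object traces a smooth 3D layer, which is what tensor voting
    would then extract.
    """
    points = []
    for t, flow in enumerate(flows):
        h, w = flow.shape[:2]
        ys, xs = np.mgrid[0:h, 0:w]           # pixel coordinates of frame t
        pts = np.stack(
            [xs.ravel().astype(float),         # x
             ys.ravel().astype(float),         # y
             np.full(h * w, float(t)),         # t
             v_scale * flow[..., 0].ravel(),   # vx (scaled velocity axis)
             v_scale * flow[..., 1].ravel()],  # vy
            axis=1)
        points.append(pts)
    return np.concatenate(points, axis=0)

# Toy example: two 2x2 frames, uniform rightward motion of 1 px/frame.
flows = [np.tile([1.0, 0.0], (2, 2, 1)) for _ in range(2)]
pts = to_5d_points(flows)
# All 8 points share (vx, vy) = (1, 0): a single rigid translation
# collapses onto one flat layer in the velocity domain.
```

Points from a second object with a different velocity would occupy a separate layer in (vx,vy), which is why layer extraction in this space yields segmentation and correspondence in one step.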

[1] Middlebury College Stereo Evaluation Webpage, http://vision.middlebury.edu/stereo/, 2008.
[2] E. Adelson and Y. Weiss, “A Unified Mixture Framework for Motion Segmentation: Incorporating Spatial Coherence and Estimating the Number of Models,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 321-326, 1996.
[3] T. Amiaz and N. Kiryati, “Dense Discontinuous Optical Flow via Contour-Based Segmentation,” Proc. IEEE Int'l Conf. Image Processing, vol. 3, pp. 1264-1267, 2005.
[4] S. Ayer and H. Sawhney, “Layered Representation of Motion Video Using Robust Maximum-Likelihood Estimation of Mixture Models and MDL Encoding,” Proc. Fifth Int'l Conf. Computer Vision, pp. 777-784, 1995.
[5] H. Baker and R. Bolles, “Generalizing Epipolar-Plane Image Analysis on the Spatiotemporal Surface,” Int'l J. Computer Vision, vol. 3, no. 1, pp. 33-49, May 1989.
[6] M. Bleyer and M. Gelautz, “A Layered Stereo Algorithm Using Image Segmentation and Global Visibility Constraints,” Proc. IEEE Int'l Conf. Image Processing, vol. 5, pp. 2997-3000, 2004.
[7] R. Bolles, H. Baker, and D. Marimont, “Epipolar-Plane Image Analysis: An Approach to Determining Structure from Motion,” Int'l J. Computer Vision, vol. 1, no. 1, pp. 7-56, 1987.
[8] M. Brown and D. Lowe, “Recognising Panoramas,” Proc. Ninth IEEE Int'l Conf. Computer Vision, pp. 1218-1225, 2003.
[9] T. Brox, A. Bruhn, N. Papenberg, and J. Weickert, “High Accuracy Optical Flow Estimation Based on a Theory for Warping,” Proc. Eighth European Conf. Computer Vision, vol. 4, pp. 25-36, 2004.
[10] T. Brox, A. Bruhn, and J. Weickert, “Variational Motion Segmentation with Level Sets,” Proc. Ninth European Conf. Computer Vision, pp. 471-483, May 2006.
[11] D. Comaniciu and P. Meer, “Mean Shift: A Robust Approach toward Feature Space Analysis,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603-619, May 2002.
[12] J. Costeira and T. Kanade, “A Multibody Factorization Method for Independently Moving Objects,” Int'l J. Computer Vision, vol. 29, no. 3, pp. 159-179, Sept. 1998.
[13] D. Cremers and S. Soatto, “Motion Competition: A Variational Approach to Piecewise Parametric Motion Segmentation,” Int'l J. Computer Vision, vol. 62, no. 3, pp. 249-265, May 2005.
[14] J. Davis, “Mosaics of Scenes with Moving Objects,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 354-360, 1998.
[15] H. Hirschmuller, “Stereo Vision in Structured Environments by Consistent Semi-Global Matching,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 2386-2393, 2006.
[16] D. Husemoller, Fibre Bundles, third ed. Springer, 1993.
[17] K. Kanatani, “Motion Segmentation by Subspace Separation and Model Selection,” Proc. Eighth Int'l Conf. Computer Vision, vol. 2, pp. 586-591, 2001.
[18] A. Klaus, M. Sormann, and K. Karner, “Segment-Based Stereo Matching Using Belief Propagation and a Self-Adapting Dissimilarity Measure,” Proc. 18th Int'l Conf. Pattern Recognition, 2006.
[19] V. Kolmogorov and R. Zabih, “Computing Visual Correspondence with Occlusions via Graph Cuts,” Proc. Eighth Int'l Conf. Computer Vision, vol. 2, pp. 508-515, 2001.
[20] G. Medioni, M. Lee, and C. Tang, A Computational Framework for Segmentation and Grouping, first ed. Elsevier, 2000.
[21] E. Mémin and P. Pérez, “Hierarchical Estimation and Segmentation of Dense Motion Fields,” Int'l J. Computer Vision, vol. 46, no. 2, pp. 129-155, Feb. 2002.
[22] C. Min and G. Medioni, “Motion Segmentation by Spatiotemporal Smoothness Using 5D Tensor Voting,” Proc. Fifth IEEE Workshop Perceptual Organization in Computer Vision, 2006.
[23] C. Min and G. Medioni, “Tensor Voting Accelerated by Graphics Processing Units (GPU),” Proc. 18th Int'l Conf. Pattern Recognition, 2006.
[24] C. Min, Q. Yu, and G. Medioni, “Multi-Layer Mosaics in the Presence of Motion and Depth Effects,” Proc. 18th Int'l Conf. Pattern Recognition, 2006.
[25] M. Nicolescu and G. Medioni, “Layered 4D Representation and Voting for Grouping from Motion,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 25, no. 4, pp. 492-501, Apr. 2003.
[26] C. Schnörr, “Determining Optical Flow for Irregular Domains by Minimizing Quadratic Functionals of a Certain Class,” Int'l J. Computer Vision, vol. 6, no. 1, pp. 25-38, Apr. 1991.
[27] H. Shum and R. Szeliski, “Systems and Experiment Paper: Construction of Panoramic Image Mosaics with Global and Local Alignment,” Int'l J. Computer Vision, vol. 36, no. 2, pp. 101-130, Feb. 2000.
[28] J. Sun, Y. Li, S. Kang, and H. Shum, “Symmetric Stereo Matching for Occlusion Handling,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 399-406, 2005.
[29] C. Tomasi and T. Kanade, “Shape and Motion from Image Streams under Orthography: A Factorization Method,” Int'l J. Computer Vision, vol. 9, no. 2, pp. 137-154, Nov. 1992.
[30] P. Torr and D. Murray, “The Development and Comparison of Robust Methods for Estimating the Fundamental Matrix,” Int'l J. Computer Vision, vol. 24, no. 3, pp. 271-300, Oct. 1997.
[31] M. Uyttendaele, A. Eden, and R. Szeliski, “Eliminating Ghosting and Exposure Artifacts in Image Mosaics,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 509-516, 2001.
[32] R. Vidal and R. Hartley, “Motion Segmentation with Missing Data Using Power Factorization and GPCA,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 310-316, 2004.
[33] J. Wang and E. Adelson, “Layered Representation for Motion Analysis,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 361-366, 1993.
[34] J. Xiao and M. Shah, “Motion Layer Extraction in the Presence of Occlusion Using Graph Cuts,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1644-1659, Oct. 2005.
[35] Q. Yang, L. Wang, R. Yang, H. Stewenius, and D. Nister, “Stereo Matching with Color-Weighted Correlation, Hierarchical Belief Propagation and Occlusion Handling,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 2347-2354, 2006.
[36] L. Zelnik-Manor and M. Irani, “Degeneracies, Dependencies and Their Implications in Multi-Body and Multi-Sequence Factorizations,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 287-293, 2003.

Index Terms:
Motion analysis, Tensor voting, Optical flow, Segmentation, Mosaicking
Citation:
Changki Min, Gérard Medioni, "Inferring Segmented Dense Motion Layers Using 5D Tensor Voting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 9, pp. 1589-1602, Sept. 2008, doi:10.1109/TPAMI.2007.70802