Layered Motion Segmentation and Depth Ordering by Tracking Edges
April 2004 (vol. 26 no. 4)
pp. 479-494
Paul Smith, IEEE Computer Society
Tom Drummond, IEEE Computer Society

Abstract: This paper presents a new Bayesian framework for motion segmentation (dividing a frame from an image sequence into layers representing different moving objects) by tracking edges between frames. Edges are found using the Canny edge detector, and the Expectation-Maximization algorithm is then used to fit motion models to these edges and to calculate the probabilities of the edges obeying each motion model. The edges are also used to segment the image into regions of similar color. The most likely labeling for these regions is then calculated by using the edge probabilities, in association with a Markov Random Field-style prior. The relative depth ordering of the different motion layers is also determined, as an integral part of the process. An efficient implementation of this framework is presented for segmenting two motions (foreground and background) using two frames. It is then demonstrated how, by tracking the edges into further frames, the probabilities may be accumulated to provide an even more accurate and robust estimate, and to segment an entire sequence. Further extensions are then presented to address the segmentation of more than two motions. Here, a hierarchical method of initializing the Expectation-Maximization algorithm is described, and it is demonstrated that the Minimum Description Length principle may be used to automatically select the best number of motion layers. The results from over 30 sequences (demonstrating both two and three motions) are presented and discussed.
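
As a rough illustration of the Expectation-Maximization step mentioned in the abstract, the following Python sketch fits two motion models to a set of per-edge displacement measurements and returns the probability of each edge obeying each model. It is a simplified stand-in and not the paper's implementation: it assumes purely translational motions, an isotropic Gaussian noise model with a fixed `sigma`, and synthetic displacements in place of tracked Canny edges. The function name `em_two_translations`, its parameters, and the initialization scheme are all hypothetical.

```python
import numpy as np


def em_two_translations(disp, n_iter=50, sigma=1.0):
    """Fit two translational motion models to displacement vectors with EM.

    disp: (N, 2) array of per-edge displacements (hypothetical input format).
    Returns (translations, responsibilities), shapes (2, 2) and (N, 2).
    """
    disp = np.asarray(disp, dtype=float)
    # Assumed initialization: the two extreme measurements along the axis
    # with the greatest spread. The paper's own initialization may differ.
    axis = np.argmax(disp.var(axis=0))
    t = np.stack([disp[np.argmin(disp[:, axis])],
                  disp[np.argmax(disp[:, axis])]])
    prior = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: probability that each edge obeys each motion model.
        sq = ((disp[:, None, :] - t[None, :, :]) ** 2).sum(axis=2)
        log_p = np.log(prior)[None, :] - sq / (2.0 * sigma ** 2)
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: each translation becomes a responsibility-weighted mean.
        weight = resp.sum(axis=0) + 1e-12  # guard against an empty model
        t = (resp.T @ disp) / weight[:, None]
        prior = weight / weight.sum()
    return t, resp


# Synthetic example: 60 "background" edges and 40 "foreground" edges.
rng = np.random.default_rng(0)
background = rng.normal([0.0, 0.0], 0.3, size=(60, 2))
foreground = rng.normal([5.0, 1.0], 0.3, size=(40, 2))
disp = np.vstack([background, foreground])

translations, resp = em_two_translations(disp)
labels = resp.argmax(axis=1)  # most likely motion model per edge
```

The soft assignments in `resp` play the role of the edge probabilities described above. In the framework the abstract describes, such probabilities feed the region labeling rather than being thresholded directly, as the final `argmax` does here for brevity.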

Index Terms:
Video analysis, motion, segmentation, depth cues.
Citation:
Paul Smith, Tom Drummond, Roberto Cipolla, "Layered Motion Segmentation and Depth Ordering by Tracking Edges," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 4, pp. 479-494, April 2004, doi:10.1109/TPAMI.2004.1265863