A Unified Approach to Moving Object Detection in 2D and 3D Scenes
June 1998 (vol. 20 no. 6)
pp. 577-589

Abstract: The detection of moving objects is important in many tasks. Previous approaches to this problem can be broadly divided into two classes: 2D algorithms, which apply when the scene can be approximated by a flat surface and/or when the camera is only undergoing rotations and zooms, and 3D algorithms, which work well only when significant depth variations are present in the scene and the camera is translating. In this paper, we describe a unified approach to handling moving-object detection in both 2D and 3D scenes, with a strategy to gracefully bridge the gap between those two extremes. Our approach is based on a stratification of the moving-object detection problem into scenarios of gradually increasing complexity. We present a set of techniques that match this stratification, ranging from 2D techniques to more complex 3D techniques. Moreover, the computations required for the solution to the problem at one complexity level become the initial processing step for the solution at the next complexity level. We illustrate these techniques using examples from real image sequences.
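As a rough illustration of the simplest (2D) end of this stratification, the sketch below compensates for a single dominant image motion and flags pixels that remain misaligned. It is not the algorithm of the paper: it assumes the dominant motion is a known integer translation, whereas a real system would have to estimate a 2D parametric transformation from the images. The names `detect_moving_pixels` and `background`, the frame sizes, and the threshold are all hypothetical choices made for this example.

```python
def detect_moving_pixels(prev, curr, shift, threshold):
    """Flag pixels of `curr` that disagree with `prev` after compensating
    for an assumed dominant 2D translation `shift` = (dy, dx)."""
    dy, dx = shift
    h, w = len(curr), len(curr[0])
    moving = set()
    for y in range(h):
        for x in range(w):
            # Location in `prev` that the dominant motion maps onto (y, x).
            py, px = y - dy, x - dx
            if 0 <= py < h and 0 <= px < w:
                # A large residual means the pixel does not follow the
                # dominant (background) motion.
                if abs(curr[y][x] - prev[py][px]) > threshold:
                    moving.add((y, x))
    return moving


def background(y, x):
    # Synthetic static texture with intensities in the range 0..49.
    return (7 * y + 13 * x) % 50


H, W = 6, 8

# Previous frame: textured background with one bright object at (2, 3).
prev = [[background(y, x) for x in range(W)] for y in range(H)]
prev[2][3] = 200

# Current frame: the whole image shifts right by one pixel (camera-induced
# motion), but the object moves independently and ends up at (2, 6).
curr = [[background(y, x - 1) for x in range(W)] for y in range(H)]
curr[2][6] = 200

detected = detect_moving_pixels(prev, curr, shift=(0, 1), threshold=30)
print(sorted(detected))  # [(2, 4), (2, 6)]
```

Two locations are flagged: the position the object would have occupied had it moved with the background, and the position it actually occupies. A purely 2D test of this kind is only meaningful when one 2D motion explains the static scene; in scenes with significant depth variation and camera translation, static points also leave residuals, which is the situation the 3D techniques mentioned in the abstract address.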

Index Terms:
Moving object detection, rigidity constraints, multiframe analysis, planar-parallax, parallax geometry, layers.
Citation:
Michal Irani, P. Anandan, "A Unified Approach to Moving Object Detection in 2D and 3D Scenes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 6, pp. 577-589, June 1998, doi:10.1109/34.683770