Structure from Motion Causally Integrated Over Time
April 2002 (vol. 24 no. 4)
pp. 523-535

We describe an algorithm for reconstructing three-dimensional structure and motion causally, in real time, from monocular sequences of images. We prove that the algorithm is minimal and stable, in the sense that the estimation error remains bounded with probability one throughout a sequence of arbitrary length. We discuss a scheme for handling occlusions (point features appearing and disappearing) and drift in the scale factor. These issues are crucial for the algorithm to operate in real time on real scenes. We describe in detail the implementation of the algorithm, which runs on a personal computer and has been made available to the community. We report the performance of our implementation on a few representative long sequences of real and synthetic images. The algorithm, which has been tested extensively over the course of the past few years, exhibits honest performance when the scene contains at least 20-40 points with high contrast, when the relative motion is slow compared to the sampling frequency of the frame grabber (30 Hz), and when the lens aperture is large enough (typically more than 30° of visual field).

Index Terms:
structure from motion, real-time vision, shape, geometry
Citation:
A. Chiuso, P. Favaro, H. Jin, S. Soatto, "Structure from Motion Causally Integrated Over Time," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 4, pp. 523-535, April 2002, doi:10.1109/34.993559