Spatio-Temporal Alignment of Sequences
November 2002 (vol. 24 no. 11)
pp. 1409-1424

Abstract: This paper studies the problem of sequence-to-sequence alignment, namely, establishing correspondences in time and in space between two different video sequences of the same dynamic scene. The sequences are recorded by uncalibrated video cameras that are either stationary or jointly moving, with fixed (but unknown) internal parameters and relative intercamera external parameters. Temporal variations between image frames (such as moving objects or changes in scene illumination) are powerful cues for alignment that standard image-to-image alignment techniques cannot exploit. We show that, by folding spatial and temporal cues into a single alignment framework, situations that are inherently ambiguous for traditional image-to-image alignment methods are often uniquely resolved by sequence-to-sequence alignment. Furthermore, the ability to align and integrate information across multiple video sequences, both in time and in space, gives rise to new video applications that are not possible when only image-to-image alignment is used.
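
To make "correspondences in time and in space" concrete, here is a minimal sketch of what one such correspondence could look like. It assumes the spatial relation between the two cameras can be written as a 3x3 homography and the temporal relation as a 1D affine map (a frame-rate ratio plus a time offset). The abstract does not state these modeling choices, and the function name, parameter names, and numeric values are all hypothetical, chosen only for illustration.

```python
from typing import Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]


def map_space_time(point: Tuple[float, float, float],
                   H: Matrix3,
                   time_scale: float,
                   time_offset: float) -> Tuple[float, float, float]:
    """Map a space-time point (x, y, t) in sequence 1 to (x', y', t') in sequence 2.

    Assumption (not from the abstract): space is related by a homography H,
    time by t' = time_scale * t + time_offset.
    """
    x, y, t = point
    # Spatial part: apply the homography in homogeneous coordinates.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Temporal part: 1D affine map between the two sequences' frame indices.
    t_mapped = time_scale * t + time_offset
    return (u / w, v / w, t_mapped)


# Hypothetical values: a pure image translation and a 2.5-frame time shift.
H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, -5.0],
     [0.0, 0.0, 1.0]]
mapped = map_space_time((100.0, 50.0, 12.0), H, time_scale=1.0, time_offset=2.5)
print(mapped)
```

In this sketch the time offset is allowed to be fractional, so a frame in one sequence may correspond to an instant between two frames of the other. Estimating H and the temporal parameters from the video data is the alignment problem itself and is not attempted here.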

Index Terms:
Sequence-to-sequence alignment, space-time analysis, direct methods, feature-based methods.
Citation:
Yaron Caspi, Michal Irani, "Spatio-Temporal Alignment of Sequences," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 11, pp. 1409-1424, Nov. 2002, doi:10.1109/TPAMI.2002.1046148