Issue No. 10 - Oct. (2013 vol. 35)
pp: 2371-2386
G. D. Evangelidis, Perception Team, INRIA Rhone-Alpes, Grenoble, France
C. Bauckhage, Fraunhofer IAIS, St. Augustin, Germany
ABSTRACT
This paper addresses the problem of video alignment. We present efficient approaches that allow for spatiotemporal alignment of two sequences. Unlike most related works, we consider independently moving cameras that capture a 3D scene at different times. The novelty of the proposed method lies in the adaptation and extension of an efficient information retrieval framework that casts the sequences as an image database and a set of query frames, respectively. The efficient retrieval builds on the recently proposed quad descriptor. In this context, we define the 3D Vote Space (VS) by aggregating votes through a multiquerying (multiscale) scheme, and we present two solutions based on VS entries: a causal solution that permits online synchronization and a global solution through multiscale dynamic programming. In addition, we extend the recently introduced ECC image-alignment algorithm to the temporal dimension, which allows for spatial registration and synchronization refinement with subframe accuracy. We investigate full search and quantization methods for short descriptors, and we compare the proposed schemes with the state of the art. Experiments with real videos captured by moving or static cameras demonstrate the efficiency of the proposed method and verify its effectiveness with respect to spatiotemporal alignment accuracy.
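To illustrate the idea of a global, dynamic-programming solution over aggregated votes, here is a minimal sketch. It is not the authors' algorithm. It assumes a simplified 2D vote matrix (query frames by reference frames) in place of the 3D Vote Space, and it assumes that time runs forward in both sequences, so the matched reference index never decreases. The function name `align_by_votes` and the matrix layout are hypothetical.

```python
import numpy as np

def align_by_votes(votes):
    """Pick one reference frame per query frame so that the chosen indices
    never decrease and the sum of the selected votes is maximal.

    votes[q, r] is an assumed retrieval score between query frame q and
    reference frame r (a 2D stand-in for the paper's 3D Vote Space).
    """
    votes = np.asarray(votes, dtype=float)
    n_query, n_ref = votes.shape
    score = np.empty_like(votes)                  # best path score ending at (q, r)
    back = np.zeros((n_query, n_ref), dtype=int)  # predecessor reference index
    score[0] = votes[0]
    for q in range(1, n_query):
        # Running maximum over reference frames 0..r of the previous row.
        best_prev = np.maximum.accumulate(score[q - 1])
        # Index at which that running maximum is attained (earliest on ties).
        arg_prev = np.zeros(n_ref, dtype=int)
        for r in range(1, n_ref):
            if score[q - 1, r] > best_prev[r - 1]:
                arg_prev[r] = r
            else:
                arg_prev[r] = arg_prev[r - 1]
        score[q] = votes[q] + best_prev
        back[q] = arg_prev
    # Backtrack from the best final cell.
    path = np.empty(n_query, dtype=int)
    path[-1] = int(np.argmax(score[-1]))
    for q in range(n_query - 1, 0, -1):
        path[q - 1] = back[q, path[q]]
    return path

# Taking the per-row argmax here would give the non-monotonic mapping 1, 0, 2.
toy_votes = [[1, 5, 0],
             [4, 0, 3],
             [0, 0, 5]]
print(align_by_votes(toy_votes))
```

A causal (online) variant would instead commit to a match for each query frame as it arrives, using only the votes seen so far. The sketch above needs the whole matrix before it can backtrack.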
INDEX TERMS
Cameras, Synchronization, Accuracy, Trajectory, Visualization, Detectors, Video sequences, short image descriptors, Video synchronization, spatiotemporal alignment, image/video retrieval
CITATION
G. D. Evangelidis, C. Bauckhage, "Efficient Subframe Video Alignment Using Short Descriptors", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 10, pp. 2371-2386, Oct. 2013, doi:10.1109/TPAMI.2013.56