Issue No.03 - March (2013 vol.35)
pp: 697-715
R. Li, Zickler Group, Harvard Univ., Cambridge, MA, USA
R. Chellappa, Center for Autom. Res., Univ. of Maryland, College Park, MD, USA
ABSTRACT
We investigate the problem of spatiotemporal alignment of videos, signals, or feature sequences extracted from them. Specifically, we consider the scenario where the spatiotemporal misalignments can be characterized by parametric transformations. Using a nonlinear analytical structure referred to as an alignment manifold, we formulate the alignment problem as an optimization problem on this nonlinear space. We focus our attention on semantically meaningful videos or signals, e.g., those describing or capturing human motion or activities, and propose a new formalism for temporal alignment that accounts for execution-rate variations among instances of the same video event. The strategy taken in this effort bridges the family of geometric optimization and the family of stochastic algorithms: we regard the search for optimal alignment parameters as a recursive state estimation problem for a particular dynamic system evolving on the alignment manifold. Subsequently, a Sequential Importance Sampling procedure on the alignment manifold is designed for effective alignment. We further extend the basic Sequential Importance Sampling algorithm into a new version called Stochastic Gradient Sequential Importance Sampling, in which we incorporate a steepest descent structure on the alignment manifold and provide a more efficient particle propagation mechanism. We demonstrate the performance of alignment using manifolds on several types of input data that arise in vision problems.
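To make the particle-based search described in the abstract more concrete, the following is a minimal toy sketch, not the authors' algorithm. It assumes a 1D signal misaligned by a temporal warp t -> a*t + b, and it works in the flat (a, b) parameter plane rather than on the paper's alignment manifold. The signal `ref`, the constants `TRUE_A` and `TRUE_B`, the temperature `temp`, the noise schedule, and every function name are hypothetical choices made for illustration. Particles are diffused, weighted by an alignment cost, and resampled; setting `grad_step > 0` adds a steepest-descent move before diffusion, loosely mirroring the stochastic-gradient variant the abstract mentions.

```python
import math
import random

TIMES = [0.2 * k for k in range(51)]   # sample instants on [0, 10]
TRUE_A, TRUE_B = 1.5, -1.0             # hypothetical ground-truth warp


def ref(u):
    """Hypothetical reference signal: one Gaussian bump centred at u = 5."""
    return math.exp(-0.5 * (u - 5.0) ** 2)


# Misaligned observation produced by warping the reference in time.
TARGET = [ref(TRUE_A * t + TRUE_B) for t in TIMES]


def cost(a, b):
    """Mean squared residual after warping the reference by t -> a*t + b."""
    return sum((ref(a * t + b) - y) ** 2
               for t, y in zip(TIMES, TARGET)) / len(TIMES)


def descent_step(a, b, step, eps=1e-4):
    """One steepest-descent move using forward-difference gradients."""
    c0 = cost(a, b)
    ga = (cost(a + eps, b) - c0) / eps
    gb = (cost(a, b + eps) - c0) / eps
    return a - step * ga, b - step * gb


def align(num_particles=150, steps=30, grad_step=0.0, temp=2e-3, seed=0):
    """Estimate (a, b) by repeated propagate / weight / resample rounds."""
    rng = random.Random(seed)
    particles = [(rng.uniform(0.5, 2.5), rng.uniform(-3.0, 3.0))
                 for _ in range(num_particles)]
    sigma = 0.3                         # diffusion scale, shrunk each round
    for _ in range(steps):
        moved = []
        for a, b in particles:
            if grad_step > 0.0:         # optional gradient-guided move
                a, b = descent_step(a, b, grad_step)
            moved.append((a + rng.gauss(0.0, sigma),
                          b + rng.gauss(0.0, sigma)))
        costs = [cost(a, b) for a, b in moved]
        best = min(costs)               # subtract the minimum for stability
        weights = [math.exp(-(c - best) / temp) for c in costs]
        particles = rng.choices(moved, weights=weights, k=num_particles)
        sigma *= 0.9
    return min(particles, key=lambda p: cost(*p))
```

Calling `align()` runs the plain sampler and `align(grad_step=0.1)` runs the gradient-guided one; comparing `cost(*align())` with `cost(1.0, 0.0)` shows how much the estimated warp reduces the residual relative to leaving the signals unaligned.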
INDEX TERMS
Manifolds, Cameras, Videos, Optimization, Stochastic processes, Heuristic algorithms, Algorithm design and analysis, geometric methods, Spatiotemporal alignment, video matching, stochastic optimization
CITATION
R. Li, R. Chellappa, "Spatiotemporal Alignment of Visual Signals on a Special Manifold", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.35, no. 3, pp. 697-715, March 2013, doi:10.1109/TPAMI.2012.144