Issue No.04 - April (2013 vol.35)
pp: 863-881
Xi Li , Australian Centre for Visual Technol., Univ. of Adelaide, Adelaide, SA, Australia
A. Dick , Australian Centre for Visual Technol., Univ. of Adelaide, Adelaide, SA, Australia
Chunhua Shen , Australian Centre for Visual Technol., Univ. of Adelaide, Adelaide, SA, Australia
A. van den Hengel , Australian Centre for Visual Technol., Univ. of Adelaide, Adelaide, SA, Australia
Hanzi Wang , Sch. of Inf. Sci. & Technol., Xiamen Univ., Xiamen, China
ABSTRACT
Visual tracking usually requires an object appearance model that is robust to changing illumination, pose, and other factors encountered in video. Many recent trackers utilize appearance samples in previous frames to form the bases upon which the object appearance model is built. This approach has the following limitations: 1) The bases are data driven, so they can be easily corrupted, and 2) it is difficult to robustly update the bases in challenging situations. In this paper, we construct an appearance model using the 3D discrete cosine transform (3D-DCT). The 3D-DCT is based on a set of cosine basis functions which are determined by the dimensions of the 3D signal and thus independent of the input video data. In addition, the 3D-DCT can generate a compact energy spectrum whose high-frequency coefficients are sparse if the appearance samples are similar. By discarding these high-frequency coefficients, we simultaneously obtain a compact 3D-DCT-based object representation and a signal reconstruction-based similarity measure (reflecting the information loss from signal reconstruction). To efficiently update the object representation, we propose an incremental 3D-DCT algorithm which decomposes the 3D-DCT into successive operations of the 2D discrete cosine transform (2D-DCT) and 1D discrete cosine transform (1D-DCT) on the input video data. As a result, the incremental 3D-DCT algorithm only needs to compute the 2D-DCT for newly added frames as well as the 1D-DCT along the third dimension, which significantly reduces the computational complexity. Based on this incremental 3D-DCT algorithm, we design a discriminative criterion to evaluate the likelihood of a test sample belonging to the foreground object. We then embed the discriminative criterion into a particle filtering framework for object state inference over time. Experimental results demonstrate the effectiveness and robustness of the proposed tracker.
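The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The class `IncrementalDCT3`, the helper names, and the choice of a low-frequency "box" of kept coefficients are all assumptions made for illustration. The sketch shows three points from the abstract: the cosine bases depend only on the signal dimensions, a newly added frame needs only its own 2D-DCT followed by a 1D-DCT along the third dimension, and discarding high-frequency coefficients loses little information when the appearance samples are similar.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis for length-n signals; it depends only on n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)
    return mat


class IncrementalDCT3:
    """Hypothetical helper that caches the 2D-DCT of every frame seen so far."""

    def __init__(self, height, width):
        self.dh = dct_matrix(height)
        self.dw = dct_matrix(width)
        self.frame_coeffs = []

    def add_frame(self, frame):
        # Only the new frame needs a 2D-DCT; cached frames are reused.
        self.frame_coeffs.append(self.dh @ frame @ self.dw.T)

    def coefficients(self):
        # 1D-DCT along the third (temporal) dimension of the cached stack.
        stack = np.stack(self.frame_coeffs, axis=2)  # height x width x time
        dt = dct_matrix(stack.shape[2])
        return stack @ dt.T


def dct3_direct(volume):
    """Reference 3D-DCT computed in one step, for comparison."""
    h, w, t = volume.shape
    return np.einsum("ai,bj,ck,ijk->abc",
                     dct_matrix(h), dct_matrix(w), dct_matrix(t), volume)


def reconstruction_loss(volume, keep):
    """Squared error after keeping only a low-frequency box of coefficients.

    `keep` = (kh, kw, kt) is an assumed truncation rule, not the paper's.
    """
    h, w, t = volume.shape
    coeffs = dct3_direct(volume)
    truncated = np.zeros_like(coeffs)
    kh, kw, kt = keep
    truncated[:kh, :kw, :kt] = coeffs[:kh, :kw, :kt]
    # The bases are orthonormal, so the inverse uses the transposed matrices.
    recon = np.einsum("ai,bj,ck,abc->ijk",
                      dct_matrix(h), dct_matrix(w), dct_matrix(t), truncated)
    return float(np.sum((volume - recon) ** 2))
```

With identical frames, all temporal coefficients beyond the first vanish, so keeping a single temporal coefficient reconstructs the stack without loss. With dissimilar frames, the same truncation leaves a nonzero error, which is the kind of information loss the abstract uses as a similarity cue.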
INDEX TERMS
Discrete cosine transforms, Algorithm design and analysis, Visualization, Robustness, Loss measurement, Image reconstruction, Adaptation models, template matching, Visual tracking, appearance model, compact representation, discrete cosine transform (DCT), incremental learning
CITATION
Xi Li, A. Dick, Chunhua Shen, A. van den Hengel, Hanzi Wang, "Incremental Learning of 3D-DCT Compact Representations for Robust Visual Tracking", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.35, no. 4, pp. 863-881, April 2013, doi:10.1109/TPAMI.2012.166