Robust Online Appearance Models for Visual Tracking
October 2003 (vol. 25 no. 10)
pp. 1296-1311
Allan D. Jepson, IEEE Computer Society
David J. Fleet, IEEE Computer Society
Thomas F. El-Maraghi, IEEE Computer Society

Abstract—We propose a framework for learning robust, adaptive appearance models to be used for motion-based tracking of natural objects. The model adapts to slowly changing appearance, and it maintains a natural measure of the stability of the observed image structure during tracking. By identifying stable properties of appearance, we can weight them more heavily for motion estimation, while less stable properties can be proportionately downweighted. The appearance model involves a mixture of stable image structure, learned over long time courses, along with two-frame motion information and an outlier process. An online EM algorithm is used to adapt the appearance model parameters over time. An implementation of this approach is developed for an appearance model based on the filter responses from a steerable pyramid. This model is used in a motion-based tracking algorithm to provide robustness in the face of image outliers, such as those caused by occlusions, while adapting to natural changes in appearance such as those due to facial expressions or variations in 3D pose.
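The abstract describes a mixture appearance model whose parameters (mixing weights, plus the mean and variance of the stable component) are adapted per observation by an online EM algorithm with limited temporal memory. The following is a minimal sketch of that idea for a single scalar observation; the component names (stable, wandering, lost), the forgetting constant `ALPHA`, the outlier density, and the variance floor are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

# Illustrative constants, not taken from the paper.
ALPHA = 0.05           # forgetting factor: roughly a 1/ALPHA-frame memory
OUTLIER_DENSITY = 0.1  # uniform density of the "lost" (outlier) component

def gaussian(x, mu, var):
    """Scalar Gaussian density."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def online_em_step(d, state):
    """One online EM update for a single appearance observation d.

    `state` holds the mixing weights m = (m_stable, m_wander, m_lost),
    the stable component's mean mu_s and variance var_s, and the
    previous observation d_prev used by the two-frame "wandering"
    component with fixed variance var_w.
    """
    m, mu_s, var_s = state["m"], state["mu_s"], state["var_s"]
    d_prev, var_w = state["d_prev"], state["var_w"]

    # E-step: posterior "ownership" of the observation by each component.
    p = np.array([m[0] * gaussian(d, mu_s, var_s),   # stable
                  m[1] * gaussian(d, d_prev, var_w), # wandering (two-frame)
                  m[2] * OUTLIER_DENSITY])           # lost (outlier)
    o = p / p.sum()

    # M-step with exponential forgetting: recent frames dominate, so the
    # model tracks slow appearance change, while o[0] (and m[0]) serve as
    # a running measure of the stability of the observed structure.
    m = (1 - ALPHA) * m + ALPHA * o
    mu_s = mu_s + ALPHA * (o[0] / m[0]) * (d - mu_s)
    var_s = var_s + ALPHA * (o[0] / m[0]) * ((d - mu_s) ** 2 - var_s)
    var_s = max(var_s, 1e-4)  # floor keeps the density proper

    state.update(m=m, mu_s=mu_s, var_s=var_s, d_prev=d)
    return state
```

In a full tracker, one such state would be maintained for each steerable-pyramid filter response in the target region, and the stable ownerships would weight the corresponding constraints in the motion estimate.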

[1] S.T. Birchfield, “Elliptical Head Tracking Using Intensity Gradients and Color Histograms,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 232-237, 1998.
[2] M.J. Black and A.D. Jepson, “EigenTracking: Robust Matching and Tracking of Articulated Objects Using a View-Based Representation,” Int'l J. Computer Vision, vol. 26, no. 1, pp. 63-84, 1998.
[3] T.J. Cham and J. Rehg, “A Multiple Hypothesis Approach to Figure Tracking,” Proc. Conf. Computer Vision and Pattern Recognition, vol. II, pp. 239–245, June 1999.
[4] D. Comaniciu, V. Ramesh, and P. Meer, “Real-Time Tracking of Non-Rigid Objects Using Mean Shift,” Proc. Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 142-149, 2000.
[5] A.P. Dempster, N.M. Laird, and D.B. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” J. Royal Statistical Soc. Series B, vol. 39, pp. 1-38, 1977.
[6] G.J. Edwards, T.F. Cootes, and C.J. Taylor, “Face Recognition Using Active Appearance Models,” Proc. European Conf. Computer Vision, pp. 581-595, 1998.
[7] T.F. El-Maraghi, “Robust On-Line Appearance Models for Visual Tracking,” PhD thesis, Dept. of Computer Science, Univ. of Toronto, 2002.
[8] D.J. Fleet, Measurement of Image Velocity. Norwell, Mass.: Kluwer, 1992.
[9] D.J. Fleet and A.D. Jepson, “Stability of Phase Information,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, pp. 1253–1268, 1993.
[10] D.J. Fleet, A.D. Jepson, and M. Jenkin, “Phase-Based Disparity Measurement,” Computer Vision and Image Understanding, vol. 53, no. 2, pp. 198-210, 1991.
[11] W.T. Freeman and E.H. Adelson, “The Design and Use of Steerable Filters,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 13, pp. 891-906, 1991.
[12] B. Frey, “Filling in Scenes by Propagating Probabilities through Layers into Appearance Models,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 185-192, 2000.
[13] G. Hager and P. Belhumeur, “Efficient Region Tracking with Parametric Models of Geometry and Illumination,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 10, pp. 1025-1039, Oct. 1998.
[14] M. Irani, B. Rousso, and S. Peleg, “Computing Occluding and Transparent Motions,” Int'l J. Computer Vision, vol. 12, no. 1, pp. 5-16, 1994.
[15] M. Isard and A. Blake, “Condensation: Conditional Density Propagation for Visual Tracking,” Int'l J. Computer Vision, vol. 29, no. 1, pp. 2-28, 1998.
[16] A.D. Jepson and M. Black, “Mixture Models for Optical Flow Computation,” Proc. Computer Vision and Pattern Recognition, pp. 760-761, June 1993.
[17] A.D. Jepson, D.J. Fleet, and M.J. Black, “A Layered Motion Representation with Occlusion and Compact Spatial Support,” Proc. European Conf. Computer Vision, vol. 1, pp. 692-706, 2002.
[18] N. Jojic and B.J. Frey, “Learning Flexible Sprites in Video Layers,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 1, pp. 199-206, 2001.
[19] S.X. Ju, M.J. Black, and A.D. Jepson, “Skin and Bones: Multi-Layer, Locally Affine, Optical Flow and Regularization with Transparency,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 307-314, 1996.
[20] D. Koller, K. Daniilidis, T. Thorhallson, and H.-H. Nagel, “Model-Based Object Tracking in Traffic Scenes,” Proc. European Conf. Computer Vision, pp. 437-452, 1992.
[21] D. Kriegman, personal communication, 2002.
[22] F. Leymarie and M. Levine, “Tracking Deformable Objects in the Plane Using an Active Contour Model,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 15, no. 6, pp. 617-634, June 1993.
[23] F.G. Meyer and P. Bouthemy, “Region-Based Tracking Using Affine Motion Models in Long Image Sequences,” Computer Vision, Graphics, and Image Processing: Image Understanding, vol. 60, no. 2, pp. 119-140, 1994.
[24] D.D. Morris and J.M. Rehg, “Singularity Analysis for Articulated Object Tracking,” Proc. Conf. Computer Vision and Pattern Recognition, pp. 289-296, June 1998.
[25] C.F. Olson, “Maximum-Likelihood Template Matching,” Proc. Computer Vision and Pattern Recognition, 2000.
[26] N. Paragios and R. Deriche, “Geodesic Active Contours and Level Sets for the Detection and Tracking of Moving Objects,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, pp. 1-15, 2000.
[27] W. Rucklidge, “Efficient Guaranteed Search for Gray-Level Patterns,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 717-723, 1997.
[28] J. Shi and C. Tomasi, “Good Features to Track,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 593-600, 1994.
[29] H. Sidenbladh, M.J. Black, and D.J. Fleet, “Stochastic Tracking of 3D Human Figures Using 2D Image Motion,” Proc. European Conf. Computer Vision, vol. 2, pp. 702-718, 2000.
[30] E.P. Simoncelli and W.T. Freeman, “The Steerable Pyramid: A Flexible Architecture for Multi-Scale Derivative Computation,” Proc. Second IEEE Int'l Conf. Image Processing, Oct. 1995.
[31] E.P. Simoncelli, W.T. Freeman, E.H. Adelson, and D.J. Heeger, “Shiftable Multi-Scale Transforms,” IEEE Trans. Information Theory, vol. 38, no. 2, pp. 587-607, Mar. 1992.
[32] C. Stauffer and W.E.L. Grimson, “Adaptive Background Mixture Models for Real-Time Tracking,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 246-252, 1999.
[33] H. Tao, H.S. Sawhney, and R. Kumar, “Dynamic Layer Representation with Applications to Tracking,” Proc. IEEE Conf. Computer Vision and Pattern Recognition, vol. 2, pp. 134-141, 2000.
[34] Y. Weiss and E. Adelson, “Slow and Smooth,” MIT AI Memo 1624, 1996.
[35] Y. Weiss and D.J. Fleet, “Velocity Likelihoods in Biological and Machine Vision,” Probabilistic Models of the Brain: Perception and Neural Function, R.P.N. Rao, B.A. Olshausen, and M.S. Lewicki, eds., pp. 81-100, Cambridge: MIT Press, 2001.
[36] www.cs.toronto.edu/vis/projectsadaptiveAppearance.html, 2003.

Index Terms:
Motion, optical flow, tracking, occlusion, EM algorithm, adaptive appearance models.
Citation:
Allan D. Jepson, David J. Fleet, Thomas F. El-Maraghi, "Robust Online Appearance Models for Visual Tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1296-1311, Oct. 2003, doi:10.1109/TPAMI.2003.1233903