A Field Model for Human Detection and Tracking
May 2006 (vol. 28 no. 5)
pp. 753-765
Ying Wu, IEEE
Ting Yu, IEEE
Large shape variability and partial occlusions challenge most object detection and tracking methods for nonrigid targets such as pedestrians. This paper presents a new approach based on a two-layer statistical field model that characterizes the prior over complex shape variations as a Boltzmann distribution and embeds this prior and the complex image likelihood into a Markov field. A probabilistic variational analysis of this model reveals a set of fixed-point equations that characterize the equilibrium of the field. This analysis leads to computationally efficient methods for calculating the image likelihood and for training the model. On this basis, effective algorithms for detecting nonrigid objects are developed. The new approach has several advantages. First, it is intrinsically suited to capturing local nonrigidity. Second, because the likelihood is distributed, the approach is robust to partial occlusions. Third, the two-layer structure provides considerable flexibility in modeling the image observations, which makes the method robust to clutter. Extensive experiments demonstrate its effectiveness.
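
The abstract mentions fixed-point equations that arise from a variational analysis of a Markov field with a Boltzmann prior. The paper's own two-layer equations are not reproduced on this page, so the sketch below is only a generic illustration of the idea: a textbook mean-field iteration on a single-layer, binary, pairwise Markov random field (an Ising-style model). The function name `mean_field`, the coupling `J`, the evidence grid `h`, and the 4-connected neighborhood are all assumptions made for this example and are not taken from the paper.

```python
import math

def mean_field(h, J=0.5, max_iters=200, tol=1e-6):
    """Generic mean-field sketch (NOT the paper's two-layer model).

    Assumed model: p(s) proportional to exp(J * sum_{i~j} s_i s_j + sum_i h_i s_i),
    with s_i in {-1, +1} on a 4-connected grid.
    Fixed-point equation iterated here: m_i = tanh(h_i + J * sum_{j in N(i)} m_j),
    where m_i approximates the mean of s_i.
    """
    rows, cols = len(h), len(h[0])
    m = [[0.0] * cols for _ in range(rows)]  # start from an uninformative state
    for _ in range(max_iters):
        max_change = 0.0
        for r in range(rows):
            for c in range(cols):
                # Sum the current mean values of the in-bounds neighbors.
                total = 0.0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += m[rr][cc]
                new = math.tanh(h[r][c] + J * total)
                max_change = max(max_change, abs(new - m[r][c]))
                m[r][c] = new  # in-place (asynchronous) update
        if max_change < tol:  # stop once a full sweep barely changes anything
            break
    return m

# Hypothetical local evidence: every site favors +1 except the center,
# which weakly favors -1 (a stand-in for a locally corrupted observation).
h = [[1.0, 1.0, 1.0],
     [1.0, -0.2, 1.0],
     [1.0, 1.0, 1.0]]
m = mean_field(h)
print(m[1][1])
```

In this toy setting the center site ends up with a positive mean despite its own negative evidence, because the neighboring sites outweigh it. This is a rough analogy for how evidence distributed over a field can tolerate a locally unreliable observation; it makes no claim about the behavior of the paper's actual model.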

Index Terms:
Object detection, shape, Markov random fields, image models, machine learning, statistical computing, probabilistic algorithms.
Citation:
Ying Wu, Ting Yu, "A Field Model for Human Detection and Tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 5, pp. 753-765, May 2006, doi:10.1109/TPAMI.2006.87