Issue No.02 - Feb. (2014 vol.36)
pp: 361-374
Xiaogang Wang, Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Hong Kong, China
Meng Wang, Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Hong Kong, China
Wei Li, Dept. of Electron. Eng., Chinese Univ. of Hong Kong, Hong Kong, China
ABSTRACT
The performance of a generic pedestrian detector may drop significantly when it is applied to a specific scene, because of the mismatch between the source training set and samples from the target scene. We propose a new approach that automatically transfers a generic pedestrian detector to a scene-specific detector in static video surveillance without manually labeling samples from the target scene. The proposed transfer learning framework consists of four steps. 1) By exploring the indegrees from target samples to source samples on a visual affinity graph, the framework weights the source samples to match the distribution of target samples. 2) It explores a set of context cues to automatically select samples from the target scene, predict their labels, and compute confidence scores to guide transfer learning. 3) The confidence scores propagate among target samples according to their underlying visual structures. 4) Target samples with higher confidence scores have a larger influence on training scene-specific detectors. All these considerations are formulated under a single objective function called confidence-encoded SVM, which avoids hard thresholding on confidence scores. During testing, only the appearance-based detector is used, without context cues. The effectiveness of the approach is demonstrated through experiments on two video surveillance data sets. Compared with a generic detector, it improves the detection rates by 48 and 36 percent at one false positive per image (FPPI) on the two data sets, respectively. The training process converges after one or two iterations on the data sets used in the experiments.
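As a rough illustration of steps 1 and 4, the sketch below weights source samples by their indegree on a k-nearest-neighbor graph and then trains a linear SVM whose hinge loss is scaled per sample. It is only a minimal sketch under stated assumptions, not the confidence-encoded SVM objective of the paper: the function names, the Euclidean k-NN graph, the max-normalization of indegrees, and the subgradient solver are all hypothetical choices made for this example.

```python
import numpy as np

def indegree_weights(source, target, k=3):
    """Assumed weighting: count how many target samples list each source
    sample among their k nearest source neighbors (its indegree)."""
    # Pairwise squared Euclidean distances, shape (n_target, n_source).
    d2 = ((target[:, None, :] - source[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest source samples for every target sample.
    nearest = np.argsort(d2, axis=1)[:, :k]
    indegree = np.bincount(nearest.ravel(), minlength=len(source))
    # Scale so the most frequently chosen source sample has weight 1.
    return indegree / indegree.max()

def train_weighted_linear_svm(X, y, weights, lam=0.01, epochs=200, lr=0.1):
    """Linear SVM with a per-sample weight on the hinge loss, fitted by
    plain subgradient descent (a stand-in solver, chosen for brevity)."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1              # samples violating the margin
        coef = weights * active * y       # weighted hinge subgradient terms
        w -= lr * (lam * w - (coef @ X) / n)
        b -= lr * (-coef.sum() / n)
    return w, b

# Toy data: two source samples lie near the target scene, two lie far away.
source = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]])
target = np.array([[0.1, 0.0], [0.9, 0.1], [0.2, 0.1]])
source_weights = indegree_weights(source, target, k=1)

# One-dimensional, linearly separable toy problem with uniform weights.
X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
w, b = train_weighted_linear_svm(X, y, np.ones(len(y)))
```

In this toy setting the two distant source samples receive zero weight, so they would contribute nothing to the weighted hinge loss. In the same spirit, confidence scores of target samples could be passed as `weights` so that more confident samples influence training more, without any hard threshold.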
INDEX TERMS
Video surveillance, pedestrian detection, transfer learning, confidence-encoded SVM, domain adaptation
CITATION
Xiaogang Wang, Meng Wang, Wei Li, "Scene-Specific Pedestrian Detection for Static Video Surveillance", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol.36, no. 2, pp. 361-374, Feb. 2014, doi:10.1109/TPAMI.2013.124