Multistage Particle Windows for Fast and Accurate Object Detection
Aug. 2012 (vol. 34 no. 8)
pp. 1589-1604
A. Prati, Dept. of Eng. Sci. & Methods, Univ. of Modena & Reggio Emilia, Reggio Emilia, Italy
G. Gualdi, Dept. of Inf. Eng., Univ. of Modena & Reggio Emilia, Modena, Italy
R. Cucchiara, Dept. of Inf. Eng., Univ. of Modena & Reggio Emilia, Modena, Italy
The common paradigm employed for object detection is the sliding window (SW) search. This approach generates grid-distributed patches, at all possible positions and sizes, which are evaluated by a binary classifier. The tradeoff between computational burden and detection accuracy is the critical point of sliding windows; several methods have been proposed to speed up the search, such as adding complementary features. We propose a paradigm that differs from any previous approach in that it casts object detection as a statistical search, using Monte Carlo sampling to estimate the likelihood density function with Gaussian kernels. The estimation relies on a multistage strategy in which the proposal distribution is progressively refined by taking into account the feedback of the classifiers. The method can be easily plugged into a Bayesian-recursive framework to exploit the temporal coherency of the target objects in videos. Several tests on pedestrian and face detection, on both images and videos, with different types of classifiers (cascade of boosted classifiers, soft cascades, and SVM) and features (covariance matrices, Haar-like features, integral channel features, and histogram of oriented gradients) demonstrate that the proposed method provides higher detection rates and accuracy, as well as a lower computational burden, than sliding window detection.
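The multistage idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the (x, y, scale) window parameterization, the stage sizes, the kernel widths, and the toy scoring function are all assumptions made for illustration. Stage 1 draws windows uniformly; each later stage draws from a Gaussian-kernel mixture centered on the previous windows and weighted by their classifier scores, which is one simple way of refining a proposal distribution from classifier feedback.

```python
import numpy as np

def toy_classifier_score(windows, target=(60.0, 40.0, 1.0)):
    """Hypothetical stand-in for a real detector: confidence in (0, 1]
    that peaks at `target`. Rows of `windows` are (x, y, scale)."""
    widths = np.array([10.0, 10.0, 0.3])           # assumed response widths
    d = (windows - np.asarray(target)) / widths
    return np.exp(-0.5 * np.sum(d * d, axis=1))

def multistage_particle_windows(score_fn, bounds,
                                n_particles=(200, 100, 50),
                                sigmas=((8.0, 8.0, 0.2), (4.0, 4.0, 0.1)),
                                seed=0):
    """Illustrative multistage search (all parameter values are assumptions).

    Returns every evaluated window, its score, and the stage it came from.
    """
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T     # per-dimension limits

    # Stage 1: no prior knowledge, so sample windows uniformly.
    particles = rng.uniform(lo, hi, size=(n_particles[0], 3))
    weights = score_fn(particles)
    all_windows, all_scores = [particles], [weights]
    all_stages = [np.zeros(len(particles), dtype=int)]

    # Later stages: sample from a Gaussian-kernel mixture whose components
    # sit on the previous particles, weighted by their classifier scores.
    for stage, (n, sigma) in enumerate(zip(n_particles[1:], sigmas), start=1):
        total = weights.sum()
        if total > 0:
            probs = weights / total
        else:                                      # no feedback: stay uniform
            probs = np.full(len(particles), 1.0 / len(particles))
        idx = rng.choice(len(particles), size=n, p=probs)
        noise = rng.normal(0.0, np.asarray(sigma, dtype=float), size=(n, 3))
        particles = np.clip(particles[idx] + noise, lo, hi)
        weights = score_fn(particles)
        all_windows.append(particles)
        all_scores.append(weights)
        all_stages.append(np.full(n, stage, dtype=int))

    return (np.concatenate(all_windows),
            np.concatenate(all_scores),
            np.concatenate(all_stages))

if __name__ == "__main__":
    search_bounds = ((0, 128), (0, 96), (0.5, 2.0))   # x, y, scale ranges
    wins, scores, stages = multistage_particle_windows(
        toy_classifier_score, search_bounds)
    for s in range(3):
        print(f"stage {s + 1}: mean score {scores[stages == s].mean():.3f}")
    print("best window (x, y, scale):", wins[np.argmax(scores)])
```

In this sketch the total number of classifier evaluations is fixed in advance by `n_particles` (350 here), whereas an exhaustive grid over the same position and scale ranges would evaluate every cell. A real detector would replace `toy_classifier_score`, and the final detections would still need a grouping step such as non-maximum suppression, which is omitted here.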


Index Terms:
search problems, Bayes methods, feature extraction, Gaussian processes, grid computing, image classification, image sampling, Monte Carlo methods, object detection, sliding window detection, multistage particle windows, accurate object detection, fast object detection, sliding window search, grid-distributed patches, binary classifier, statistical-based search, Monte Carlo sampling, likelihood density function, Gaussian kernels, multistage strategy, Bayesian-recursive framework, temporal coherency, face detection, pedestrian detection, face, accuracy, support vector machines, coarse-to-fine search refinement, efficient object detection
Citation:
A. Prati, G. Gualdi, R. Cucchiara, "Multistage Particle Windows for Fast and Accurate Object Detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 8, pp. 1589-1604, Aug. 2012, doi:10.1109/TPAMI.2011.247