Issue No. 08 - Aug. 2013 (vol. 35)
pp: 1972-1984
D. Tosato, Dipt. di Inf., Univ. of Verona, Verona, Italy
M. Spera, Dipt. di Inf., Univ. of Verona, Verona, Italy
M. Cristani, Pattern Anal. & Comput. Vision (PAVIS) Dept., Ist. Italiano di Tecnol., Genoa, Italy
V. Murino, Pattern Anal. & Comput. Vision (PAVIS) Dept., Ist. Italiano di Tecnol., Genoa, Italy
ABSTRACT
In surveillance applications, the head and body orientation of people is of primary importance for assessing many behavioral traits. Unfortunately, in this context people are often represented by only a few noisy pixels, which makes their characterization difficult. We address this issue by proposing a computational framework based on an expressive descriptor, the covariance of features. Covariances have previously been employed for pedestrian detection, which is a binary classification problem on Riemannian manifolds. In this paper, we show how to extend this approach to the multiclass case, presenting a novel descriptor, named weighted array of covariances, that is especially suited to tiny image representations. The extension requires a novel differential geometry approach in which covariances are projected onto a unique tangent space where standard machine learning techniques can be applied. In particular, we adopt the Campbell-Baker-Hausdorff expansion to approximate, on the tangent space, the genuine (geodesic) distances on the manifold in a very efficient way. We test our methodology on multiple benchmark datasets and also propose new testing sets, obtaining convincing results in all cases.
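The idea of projecting covariances onto a single tangent space can be illustrated with a minimal Python sketch. It is not the authors' implementation: the per-pixel feature set (position, intensity, absolute gradients), the choice of the identity matrix as the tangent point, and all function names are assumptions made for illustration. Keeping only the first-order term of the Campbell-Baker-Hausdorff expansion reduces the comparison of two covariances to a Euclidean distance between their matrix logarithms; the abstract does not state which truncation order the paper uses.

```python
import numpy as np

def covariance_descriptor(patch):
    """Covariance of per-pixel features [x, y, I, |dI/dx|, |dI/dy|] (assumed feature set)."""
    patch = np.asarray(patch, dtype=float)
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = np.gradient(patch)  # derivatives along rows, then columns
    feats = np.stack([xs, ys, patch, np.abs(dx), np.abs(dy)], axis=-1).reshape(-1, 5)
    cov = np.cov(feats, rowvar=False)
    return cov + 1e-6 * np.eye(5)  # small ridge keeps the matrix positive definite

def spd_log(m):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    m = 0.5 * (m + m.T)  # remove numerical asymmetry
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.T

def tangent_vector(m):
    """Log-map at the identity (assumed tangent point), flattened to a vector.

    Off-diagonal entries are scaled by sqrt(2) so that the Euclidean norm of
    the vector equals the Frobenius norm of the symmetric log-matrix.
    """
    log_m = spd_log(m)
    rows, cols = np.triu_indices(log_m.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return log_m[rows, cols] * scale

def geodesic_distance(a, b):
    """Affine-invariant geodesic distance between two SPD matrices."""
    w, v = np.linalg.eigh(a)
    a_inv_sqrt = (v / np.sqrt(w)) @ v.T
    return np.linalg.norm(spd_log(a_inv_sqrt @ b @ a_inv_sqrt), "fro")

def tangent_distance(a, b):
    """First-order approximation: Euclidean distance between tangent vectors."""
    return np.linalg.norm(tangent_vector(a) - tangent_vector(b))

# Usage on two synthetic low-resolution patches (random data, for illustration only)
rng = np.random.default_rng(0)
cov_a = covariance_descriptor(rng.random((12, 8)))
cov_b = covariance_descriptor(rng.random((12, 8)))
print(geodesic_distance(cov_a, cov_b), tangent_distance(cov_a, cov_b))
```

Once every descriptor is mapped to a fixed-length tangent vector, any standard vector-space classifier can be trained on the result. The two distances coincide exactly when the matrices commute and differ otherwise, which is the gap that higher-order terms of the expansion would reduce.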
INDEX TERMS
Manifolds, Symmetric matrices, Head, Magnetic heads, Covariance matrix, Humans, Estimation, Riemannian manifolds, Pedestrian characterization, Covariance descriptors
CITATION
D. Tosato, M. Spera, M. Cristani, V. Murino, "Characterizing Humans on Riemannian Manifolds", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 35, no. 8, pp. 1972-1984, Aug. 2013, doi:10.1109/TPAMI.2012.263