Shared Kernel Information Embedding for Discriminative Inference
April 2012 (vol. 34 no. 4)
pp. 778-790
L. Sigal, Disney Res., Pittsburgh, PA, USA
R. Memisevic, Dept. of Comput. Sci., Univ. of Frankfurt, Frankfurt, Germany
D. J. Fleet, Dept. of Comput. Sci., Univ. of Toronto, Toronto, ON, Canada
Latent variable models (LVMs), such as the GPLVM and related methods, help mitigate overfitting when learning from small or moderately sized training sets. Nevertheless, existing methods suffer from several problems: 1) computational complexity, 2) the lack of explicit mappings to and from the latent space, 3) an inability to cope with multimodality, and 4) the lack of a well-defined density over the latent space. We propose an LVM called the Kernel Information Embedding (KIE) that defines a coherent joint density over the input and a learned latent space. Learning is quadratic in the number of training points, and it works well on small data sets. We also introduce a generalization, the shared KIE (sKIE), that allows us to model multiple input spaces (e.g., image features and poses) using a single, shared latent representation. KIE and sKIE permit missing data during inference and partially labeled data during learning. We show that, with data sets too large to learn a coherent global model, one can use the sKIE to learn local online models. We use sKIE for human pose inference.
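
To make the idea of a joint density over the input and latent space concrete, the following is a minimal sketch, not the authors' implementation. It assumes isotropic Gaussian kernels, a product kernel for the joint density, and fixed bandwidths; the function names and the parameters hx and hz are hypothetical. It estimates the mutual information between inputs and latent points from kernel density estimates, which is the kind of quantity an information embedding would maximize with respect to the latent positions. The double loop over data points is where a quadratic cost in the training-set size comes from.

```python
import math


def gaussian_kernel(a, b, bandwidth):
    """Isotropic Gaussian kernel between two equal-length vectors."""
    dim = len(a)
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    norm = (2.0 * math.pi * bandwidth ** 2) ** (dim / 2.0)
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2)) / norm


def mutual_information_estimate(xs, zs, hx=1.0, hz=1.0):
    """Kernel estimate of I(x; z) from paired samples (xs[i], zs[i]).

    Assumption for this sketch: the joint density is a kernel density
    estimate with a product kernel k_x * k_z, and the marginals use
    k_x and k_z alone. Each of the N points is compared with all N
    points, so the cost is O(N^2) kernel evaluations.
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        kx = [gaussian_kernel(xs[i], xs[j], hx) for j in range(n)]
        kz = [gaussian_kernel(zs[i], zs[j], hz) for j in range(n)]
        p_x = sum(kx) / n                                # marginal p(x_i)
        p_z = sum(kz) / n                                # marginal p(z_i)
        p_xz = sum(a * b for a, b in zip(kx, kz)) / n    # joint p(x_i, z_i)
        total += math.log(p_xz) - math.log(p_x) - math.log(p_z)
    return total / n


# Two well-separated clusters in the input space.
inputs = [[0.0], [0.1], [5.0], [5.1]]
# A latent layout that mirrors the cluster structure ...
latents_aligned = [[0.0], [0.0], [3.0], [3.0]]
# ... and one that collapses every point to the same location.
latents_collapsed = [[0.0], [0.0], [0.0], [0.0]]

mi_aligned = mutual_information_estimate(inputs, latents_aligned)
mi_collapsed = mutual_information_estimate(inputs, latents_collapsed)
```

In this toy example the collapsed latent layout carries no information about the inputs, so its estimate is zero, while the layout that mirrors the cluster structure scores higher. A learning procedure in this spirit would adjust the latent positions to increase the estimate, typically with an additional constraint or regularizer so that the latent points cannot spread out without bound.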

Index Terms:
pose estimation, operating system kernels, human pose inference, shared kernel information embedding, discriminative inference, latent variable models, latent space, coherent joint density, Kernel, Manifolds, Training, Bandwidth, Data models, Estimation, Probabilistic logic, mutual information, kernel information embedding, inference, nonparametric
Citation:
L. Sigal, R. Memisevic, D. J. Fleet, "Shared Kernel Information Embedding for Discriminative Inference," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 4, pp. 778-790, April 2012, doi:10.1109/TPAMI.2011.154