Issue No.10 - October (2009 vol.31)
pp: 1847-1861
Minyoung Kim, Rutgers University, Piscataway
Vladimir Pavlovic, Rutgers University, Piscataway
ABSTRACT
We consider the problem of predicting a sequence of real-valued multivariate states that are correlated by some unknown dynamics, from a given measurement sequence. Although dynamic systems such as state-space models are popular probabilistic models for this problem, their joint modeling of states and observations, together with traditional generative learning that maximizes a joint likelihood, may not be optimal for the ultimate prediction goal. In this paper, we suggest two novel discriminative approaches to dynamic state prediction: 1) learning generative state-space models with discriminative objectives and 2) developing an undirected conditional model. These approaches are motivated by the success of recent discriminative approaches to structured output classification in discrete-state domains, namely, discriminative training of Hidden Markov Models and Conditional Random Fields (CRFs). Extending CRFs to real-valued multivariate state domains generally entails imposing density integrability constraints on the CRF parameter space, which can make parameter learning difficult. We introduce an efficient convex learning algorithm to handle this task. Experiments on several problem domains, including human motion and robot-arm state estimation, indicate that the proposed approaches yield high prediction accuracy comparable to or better than state-of-the-art methods.
INDEX TERMS
Discriminative models and learning, dynamic state prediction, state-space models, conditional random fields.
CITATION
Minyoung Kim, Vladimir Pavlovic, "Discriminative Learning for Dynamic State Prediction", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 31, no. 10, pp. 1847-1861, October 2009, doi:10.1109/TPAMI.2009.37