
ASCII Text
M. Wohlmayr, F. Pernkopf, S. Tschiatschek, "Maximum Margin Bayesian Network Classifiers," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 3, pp. 521-532, March 2012.
BibTex
@article{ 10.1109/TPAMI.2011.149,
  author = {M. Wohlmayr and F. Pernkopf and S. Tschiatschek},
  title = {Maximum Margin Bayesian Network Classifiers},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {34},
  number = {3},
  issn = {0162-8828},
  year = {2012},
  pages = {521-532},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2011.149},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote
TY  - JOUR
JO  - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI  - Maximum Margin Bayesian Network Classifiers
IS  - 3
SN  - 0162-8828
SP  - 521
EP  - 532
EPD  - 521-532
A1  - M. Wohlmayr
A1  - F. Pernkopf
A1  - S. Tschiatschek
PY  - 2012
KW  - pattern classification
KW  - belief networks
KW  - conjugate gradient methods
KW  - convex programming
KW  - feature extraction
KW  - learning (artificial intelligence)
KW  - maximum likelihood estimation
KW  - margin-optimized Bayesian network classifiers
KW  - maximum margin Bayesian network classifiers
KW  - maximum margin parameter learning algorithm
KW  - conjugate gradient method
KW  - normalization constraints
KW  - probabilistic interpretation
KW  - missing feature handling
KW  - conditional likelihood learning
KW  - maximum likelihood learning
KW  - discriminative parameter learning
KW  - maximum margin optimization approach
KW  - convex relaxation
KW  - CG-based optimization
KW  - Bayesian methods
KW  - Optimization
KW  - Niobium
KW  - Fasteners
KW  - Random variables
KW  - Training
KW  - Algorithm design and analysis
KW  - Bayesian network classifier
KW  - discriminative learning
KW  - discriminative classifiers
KW  - large margin training
KW  - missing features
VL  - 34
JA  - IEEE Transactions on Pattern Analysis and Machine Intelligence
ER  -