Richard Nock, Frank Nielsen, "Bregman Divergences and Surrogates for Learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 11, pp. 2048-2059, November 2009.
BibTeX:
@article{10.1109/TPAMI.2008.225,
  author = {Richard Nock and Frank Nielsen},
  title = {Bregman Divergences and Surrogates for Learning},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume = {31},
  number = {11},
  issn = {0162-8828},
  year = {2009},
  pages = {2048-2059},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPAMI.2008.225},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}
RefWorks Procite/RefMan/Endnote:
TY  - JOUR
JO  - IEEE Transactions on Pattern Analysis and Machine Intelligence
TI  - Bregman Divergences and Surrogates for Learning
IS  - 11
SN  - 0162-8828
SP  - 2048
EP  - 2059
EPD - 2048-2059
A1  - Richard Nock
A1  - Frank Nielsen
PY  - 2009
KW  - Ensemble learning
KW  - boosting
KW  - Bregman divergences
KW  - linear separators
KW  - decision trees
VL  - 31
JA  - IEEE Transactions on Pattern Analysis and Machine Intelligence
ER  -
[1] P. Bartlett, M. Jordan, and J.D. McAuliffe, “Convexity, Classification, and Risk Bounds,” J. Am. Statistical Assoc., vol. 101, pp. 138-156, 2006.
[2] P. Bartlett and M. Traskin, “Adaboost is Consistent,” Proc. Neural Information Processing Systems Conf., 2006.
[3] M.J. Kearns and Y. Mansour, “On the Boosting Ability of Top-Down Decision Tree Learning Algorithms,” J. Computer and System Sciences, vol. 58, pp. 109-128, 1999.
[4] R.E. Schapire and Y. Singer, “Improved Boosting Algorithms Using Confidence-Rated Predictions,” Proc. Conf. Computational Learning Theory, pp. 80-91, 1998.
[5] J. Friedman, T. Hastie, and R. Tibshirani, “Additive Logistic Regression: A Statistical View of Boosting,” Annals of Statistics, vol. 28, pp. 337-374, 2000.
[6] V. Vapnik, Statistical Learning Theory. John Wiley, 1998.
[7] N. Murata, T. Takenouchi, T. Kanamori, and S. Eguchi, “Information Geometry of $\mathcal{U}$-Boost and Bregman Divergence,” Neural Computation, vol. 16, pp. 1437-1481, 2004.
[8] P. Grünwald and P. Dawid, “Game Theory, Maximum Entropy, Minimum Discrepancy and Robust Bayesian Decision Theory,” Annals of Statistics, vol. 32, pp. 1367-1433, 2004.
[9] M. Collins, R. Schapire, and Y. Singer, “Logistic Regression, Adaboost and Bregman Distances,” Proc. Conf. Computational Learning Theory, pp. 158-169, 2000.
[10] R.E. Schapire and Y. Singer, “Improved Boosting Algorithms Using Confidence-Rated Predictions,” Machine Learning, vol. 37, pp. 297-336, 1999.
[11] A. Azran and R. Meir, “Data Dependent Risk Bounds for Hierarchical Mixture of Experts Classifiers,” Proc. Conf. Computational Learning Theory, pp. 427-441, 2004.
[12] A. Banerjee, X. Guo, and H. Wang, “On the Optimality of Conditional Expectation As a Bregman Predictor,” IEEE Trans. Information Theory, vol. 51, pp. 2664-2669, 2005.
[13] C. Gentile and M. Warmuth, “Linear Hinge Loss and Average Margin,” Proc. 1998 Conf. Advances in Neural Information Processing Systems, pp. 225-231, 1998.
[14] D. Helmbold, J. Kivinen, and M. Warmuth, “Relative Loss Bounds for Single Neurons,” IEEE Trans. Neural Networks, vol. 10, no. 6, pp. 1291-1304, Nov. 1999.
[15] A. Banerjee, S. Merugu, I. Dhillon, and J. Ghosh, “Clustering with Bregman Divergences,” J. Machine Learning Research, vol. 6, no. 6, pp. 1705-1749, Nov. 2005.
[16] R. Nock and F. Nielsen, “A $\mathbb{R}$eal Generalization of Discrete AdaBoost,” Artificial Intelligence, vol. 171, pp. 25-41, 2007.
[17] R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee, “Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods,” Annals of Statistics, vol. 26, pp. 1651-1686, 1998.
[18] L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone, Classification and Regression Trees. Wadsworth, 1984.
[19] J.R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
[20] K. Matsushita, “Decision Rule, Based on Distance, for the Classification Problem,” Annals of the Inst. of Statistical Math., vol. 8, pp. 67-77, 1956.
[21] M. Warmuth, J. Liao, and G. Rätsch, “Totally Corrective Boosting Algorithms that Maximize the Margin,” Proc. Int'l Conf. Machine Learning, pp. 1001-1008, 2006.
[22] J. Kivinen and M. Warmuth, “Boosting As Entropy Projection,” Proc. Conf. Computational Learning Theory, pp. 134-144, 1999.
[23] R. Nock and F. Nielsen, “On Domain-Partitioning Induction Criteria: Worst-Case Bounds for the Worst-Case Based,” Theoretical Computer Science, vol. 321, pp. 371-382, 2004.
[24] C. Henry, R. Nock, and F. Nielsen, “$\mathbb{R}$eal Boosting a la Carte with an Application to Boosting Oblique Decision Trees,” Proc. 21st Int'l Joint Conf. Artificial Intelligence, pp. 842-847, 2007.
[25] C.L. Blake, E. Keogh, and C.J. Merz, “UCI Repository of Machine Learning Databases,” http://www.ics.uci.edu/~mlearn/MLRepository.html, 1998.

