
Hanzhong Gu and Haruhisa Takahashi, "How Bad May Learning Curves Be?," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1155-1167, Oct. 2000, doi: 10.1109/34.879795.
Index Terms—Generalization, concept learning, generalization error, learning curves, sample complexity, PAC learning, worst-case learning, interpolation dimension.
Abstract—In this paper, we motivate the need for estimating bounds on the learning curves of average-case learning algorithms when they perform worst on training samples. We then apply the method of reducing learning problems to hypothesis-testing ones to investigate the learning curves of a so-called ill-disposed learning algorithm in terms of a system complexity, the Boolean interpolation dimension. Since the ill-disposed algorithm behaves worse than ordinary ones, and the Boolean interpolation dimension is generally bounded by the number of system weights, the results can be applied to interpret or to bound the worst-case learning curve in real learning situations. This study leads to a new understanding of worst-case generalization in real learning situations, one that differs significantly from that in the uniformly learnable setting via Vapnik-Chervonenkis (VC) dimension analysis. We illustrate the results with numerical simulations.
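The paper's own simulations are not reproduced here, but the quantity a learning curve tracks — expected generalization error as a function of sample size — can be illustrated with a toy sketch. The example below (all names and the threshold-concept setup are our own, not the paper's) trains a consistent learner on a one-dimensional threshold concept and averages its generalization error over many random samples; an "ill-disposed" learner in the paper's sense would instead be the consistent hypothesis chooser that maximizes, rather than happens to reduce, this error.

```python
import random

def true_concept(x, theta=0.5):
    """Target concept: label 1 iff x >= theta (theta is unknown to the learner)."""
    return x >= theta

def learn_threshold(sample):
    """A consistent learner: use the smallest positive example as the threshold.
    This is only one of many consistent hypotheses; an ill-disposed learner
    would pick the consistent hypothesis with the largest generalization error."""
    positives = [x for x, y in sample if y]
    return min(positives) if positives else 1.0

def gen_error(theta_hat, theta=0.5):
    """Generalization error under the uniform input distribution on [0, 1]:
    the probability mass of the region where hypothesis and target disagree."""
    return abs(theta_hat - theta)

def avg_learning_curve(m, trials=2000, seed=0):
    """Average-case learning curve: mean generalization error at sample size m."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        xs = (rng.random() for _ in range(m))
        sample = [(x, true_concept(x)) for x in xs]
        total += gen_error(learn_threshold(sample))
    return total / trials

# Error should shrink roughly like 1/m as the sample grows.
curve = {m: avg_learning_curve(m) for m in (10, 40, 160)}
```

Plotting `curve` against `m` gives the average-case learning curve for this learner; the paper's concern is how much worse such a curve can become when the learner is chosen adversarially among consistent hypotheses.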