2nd New Zealand Two-Stream International Conference on Artificial Neural Networks and Expert Systems (ANNES '95), Dunedin, New Zealand, November 20-23, 1995. ISBN: 0-8186-7174-2
A neural network is trained on a set of available examples by minimizing the training error, so that the network parameters fit the examples well. The real goal, however, is to minimize the generalization error, to which no direct access is possible. The training error and the generalization error differ because of the statistical fluctuation of the examples. The present talk addresses this problem from the statistical point of view. When the number of training examples is large, a universal asymptotic evaluation of the discrepancy between the two errors is available, and it can be used for model selection based on an information criterion. When the number of training examples is small, the discrepancy is large and causes a serious overfitting or overtraining problem. We analyze this phenomenon by using a simple model. Surprisingly, within a certain range the generalization error even increases as the number of examples increases. This shows the inadequacy of learning by minimizing the training error alone. We evaluate various means of overcoming overtraining, such as cross-validated early stopping of training, the introduction of regularization terms, model selection, and others.
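The gap between the two errors, and the idea of cross-validated early stopping, can be illustrated with a minimal Python sketch. It is not the simple model analyzed in the talk: the linear teacher, the noise level, the sample sizes, and all function names (`make_data`, `mse`, `gradient_step`, `train_with_early_stopping`) are assumptions chosen for illustration. A large fresh sample stands in for the generalization error, which is not directly accessible in practice.

```python
import random

def make_data(rng, n, w_true, noise=0.5):
    """Draw n noisy examples y = w_true . x + noise (assumed linear teacher)."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in w_true]
        y = sum(w * xi for w, xi in zip(w_true, x)) + rng.gauss(0.0, noise)
        data.append((x, y))
    return data

def mse(w, data):
    """Mean squared error of the linear model w on a data set."""
    return sum((sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
               for x, y in data) / len(data)

def gradient_step(w, data, lr):
    """One batch gradient-descent step on the mean squared error."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(data)
    return [wi - lr * g for wi, g in zip(w, grad)]

def train_with_early_stopping(train, valid, steps=200, lr=0.01):
    """Minimize the training error, but keep the weights that gave the
    lowest error on the held-out validation examples."""
    w = [0.0] * len(train[0][0])
    best_w, best_valid = list(w), mse(w, valid)
    history = []  # (training error, validation error) after each step
    for _ in range(steps):
        w = gradient_step(w, train, lr)
        tr, va = mse(w, train), mse(w, valid)
        history.append((tr, va))
        if va < best_valid:
            best_w, best_valid = list(w), va
    return best_w, best_valid, history

rng = random.Random(0)
w_true = [rng.gauss(0.0, 1.0) for _ in range(10)]  # 10 parameters
examples = make_data(rng, 20, w_true)              # few examples
train, valid = examples[:15], examples[15:]        # hold out 5 for validation
fresh = make_data(rng, 2000, w_true)               # proxy for generalization

best_w, best_valid, history = train_with_early_stopping(train, valid)
final_train, final_valid = history[-1]
print("final training error:       ", final_train)
print("final validation error:     ", final_valid)
print("best validation error:      ", best_valid)
print("error of best_w on new data:", mse(best_w, fresh))
```

With few examples relative to the number of parameters, the training error printed here is typically an optimistic estimate of the error on new data. Holding out examples for validation also leaves fewer for training, which is one of the trade-offs a comparison of such methods has to weigh.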
Index Terms:
overtraining, training error, generalization error, learning curve
Citation:
Shun-ichi Amari, "Training Error, Generalization Error and Learning Curves in Neural Learning," annes, pp.4, 2nd New Zealand Two-Stream International Conference on Artificial Neural Networks and Expert Systems (ANNES '95), 1995