ABSTRACT
A probabilistic interpretation is presented for two important issues in neural network based classification, namely the interpretation of discriminative training criteria and the neural network outputs as well as the interpretation of the structure of the neural network. The problem of finding a suitable structure of the neural network can be linked to a number of well established techniques in statistical pattern recognition, such as the method of potential functions, kernel densities, and continuous mixture densities. Discriminative training of neural network outputs amounts to approximating the class or posterior probabilities of the classical statistical approach. This paper extends these links by introducing and analyzing novel criteria such as maximizing the class probability and minimizing the smoothed error rate. These criteria are defined in the framework of class-conditional probability density functions. We will show that these criteria can be interpreted in terms of weighted maximum likelihood estimation, where the weights depend in a complicated nonlinear fashion on the model parameters to be trained. In particular, this approach covers widely used techniques such as corrective training, learning vector quantization, and linear discriminant analysis.
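The weighted maximum likelihood interpretation can be illustrated with a minimal numerical sketch. It assumes one-dimensional Gaussian class-conditional densities with unit variance and fixed class priors, an illustrative model chosen here and not necessarily the one analyzed in the paper. Under that assumption, the gradient of the log class posterior criterion with respect to each class mean is a sum of ordinary maximum likelihood gradients, each weighted by the difference between the class indicator and the current posterior. The code compares this weighted form against a finite-difference gradient. All function names and the toy data are hypothetical.

```python
import math


def log_gauss(x, mu):
    """Log density of a unit-variance Gaussian (assumed class-conditional model)."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mu) ** 2


def posteriors(x, means, priors):
    """Class posteriors p(k | x) obtained from Bayes' rule."""
    joint = [p * math.exp(log_gauss(x, m)) for m, p in zip(means, priors)]
    total = sum(joint)
    return [j / total for j in joint]


def log_posterior_criterion(data, means, priors):
    """Discriminative criterion: sum over samples of log p(c_n | x_n)."""
    return sum(math.log(posteriors(x, means, priors)[c]) for x, c in data)


def weighted_ml_gradient(data, means, priors):
    """Gradient of the criterion w.r.t. the means, written as weighted ML.

    Each sample contributes the ML gradient (x - mu_k) of class k,
    weighted by [indicator(k == c_n) - p(k | x_n)]. The weight depends
    on the current parameters through the posterior.
    """
    grad = [0.0] * len(means)
    for x, c in data:
        post = posteriors(x, means, priors)
        for k, mu in enumerate(means):
            weight = (1.0 if k == c else 0.0) - post[k]
            grad[k] += weight * (x - mu)
    return grad


def numeric_gradient(data, means, priors, eps=1e-6):
    """Central finite-difference gradient of the criterion, used as a check."""
    grad = []
    for k in range(len(means)):
        up = list(means)
        down = list(means)
        up[k] += eps
        down[k] -= eps
        diff = (log_posterior_criterion(data, up, priors)
                - log_posterior_criterion(data, down, priors))
        grad.append(diff / (2.0 * eps))
    return grad


# Hypothetical toy data: (observation, class label) pairs.
DATA = [(-1.2, 0), (-0.3, 0), (0.4, 1), (1.5, 1), (0.1, 0)]
MEANS = [-0.5, 0.8]
PRIORS = [0.6, 0.4]

if __name__ == "__main__":
    print("weighted ML form :", weighted_ml_gradient(DATA, MEANS, PRIORS))
    print("finite difference:", numeric_gradient(DATA, MEANS, PRIORS))
```

In this sketch, samples that the current model already classifies with high posterior confidence receive weights near zero, so the update is dominated by samples near the decision boundary or on the wrong side of it.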
INDEX TERMS
Statistical pattern recognition, neural networks, discriminant functions, training criteria, speech recognition.
CITATION

H. Ney, "On the Probabilistic Interpretation of Neural Network Classifiers and Discriminative Training Criteria," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 17, no. , pp. 107-119, 1995.
doi:10.1109/34.368176