On the Discriminatory Power of Adaptive Feed-Forward Layered Networks
August 1994 (vol. 16 no. 8)
pp. 837-842

This correspondence expands the available theoretical framework linking discriminant analysis to adaptive feed-forward layered linear-output networks used as mean-square classifiers. This link provides further theoretical justification for using these nets in pattern classification and yields better insight into their behavior and use. The authors prove that, under reasonable assumptions, minimizing the mean-square error at the network output is equivalent to minimizing both: 1) the difference between the optimum value of a familiar discriminant criterion and the value of this criterion evaluated in the space spanned by the outputs of the final hidden layer, and 2) the difference between the values of the same discriminant criterion evaluated in the desired-output and actual-output subspaces. The authors also show, under specific constraints, how to solve the following problem: given a feature extraction criterion, select the target coding scheme such that this criterion is maximized at the output of the network's final hidden layer. Other properties of these networks are also explored.
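The link stated in the abstract can be illustrated with a small numerical sketch (ours, not the paper's derivation): for a linear output layer with one-hot targets, the MSE-optimal weights are the least-squares solution, and a familiar discriminant criterion, trace(S_T^-1 S_B), can then be evaluated on the resulting outputs. The two-Gaussian data and all variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two Gaussian classes stand in for the outputs of the final hidden layer.
X0 = rng.normal(loc=-1.0, scale=0.5, size=(100, 3))
X1 = rng.normal(loc=+1.0, scale=0.5, size=(100, 3))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

# One-hot target coding scheme (one choice of the coding schemes discussed).
T = np.eye(2)[y]

# MSE-optimal linear output layer: least-squares solution via the pseudoinverse.
W = np.linalg.pinv(X) @ T
Y = X @ W  # network outputs (mean-square classifier)

# A familiar discriminant criterion, trace(S_T^-1 S_B), evaluated in the
# output space; it is bounded above by C - 1 = 1 for C = 2 classes, and
# approaches that bound as class separation improves.
mu = Y.mean(axis=0)
S_T = (Y - mu).T @ (Y - mu)                      # total scatter
S_B = sum(np.sum(y == c) * np.outer(Y[y == c].mean(axis=0) - mu,
                                    Y[y == c].mean(axis=0) - mu)
          for c in (0, 1))                       # between-class scatter
J = np.trace(np.linalg.pinv(S_T) @ S_B)
print(round(J, 3))  # a value in (0, 1]; near 1 here since classes are well separated
```

Because the total scatter decomposes as S_T = S_W + S_B, the eigenvalues of S_T^-1 S_B lie in [0, 1], so J cannot exceed the number of classes minus one; maximizing J at the hidden-layer outputs is the feature-extraction reading of MSE training described above.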

[1] W. Y. Huang and R. P. Lippmann, "Neural net and traditional classifiers," in Neural Information Processing Systems, D. Anderson, Ed. New York: American Institute of Physics, 1988, pp. 387-396.
[2] R. P. Gorman and T. J. Sejnowski, "Learned classification of sonar targets using a massively parallel network," IEEE Trans. Acoust., Speech, and Signal Processing, vol. 36, no. 7, pp. 1135-1140, 1988.
[3] R. Lippmann, "Pattern classification using neural networks," IEEE Commun. Mag., vol. 27, no. 11, 1989.
[4] Y. Lee and R. P. Lippmann, "Practical characteristics of neural network and conventional pattern classifiers on artificial and speech problems," in Proc. Neural Inform. Processing Systems-Natural and Synthetic Conf., Denver, CO, Nov. 1989, pp. 168-177.
[5] P. Gallinari, S. Thiria, and F. Fogelman-Soulie, "Multilayer perceptrons and data analysis," in Proc. 1988 ICNN Conf., San Diego, CA, 1988, pp. 391-398.
[6] A. R. Webb and D. Lowe, "The optimized internal representation of multilayer classifier networks performs nonlinear discriminant analysis," Neural Networks, vol. 3, no. 4, pp. 367-375, 1990.
[7] D. Lowe and A. R. Webb, "Optimized feature extraction and the Bayes decision in feed-forward classifier networks," IEEE Trans. Pattern Anal. Machine Intell., vol. 13, no. 4, pp. 355-364, 1991.
[8] H. Asoh and N. Otsu, "An approximation of nonlinear discriminant analysis by multilayer neural networks," in Proc. Int. Joint Conf. Neural Networks, San Diego, CA, 1990, pp. III-211-III-216.
[9] P. Gallinari, S. Thiria, F. Badran, and F. Fogelman-Soulie, "On the relations between discriminant analysis and multilayer perceptrons," Neural Networks, vol. 4, no. 3, pp. 349-360, 1991.
[10] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York: Wiley, 1973.
[11] W. G. Wee, "Generalized inverse approach to adaptive multiclass pattern classification," IEEE Trans. Comput., vol. C-17, no. 12, pp. 1157-1164, 1968.
[12] K. Fukunaga, Introduction to Statistical Pattern Recognition. New York: Academic, 1972.
[13] K. Fukunaga and R. D. Short, "Nonlinear feature extraction with a general criterion function," IEEE Trans. Inform. Theory, vol. IT-24, no. 5, pp. 600-607, 1978.
[14] P. A. Devijver, "Relationships between statistical risks and the least-mean-square error design criterion in pattern recognition," in Proc. First Int. Joint Conf. Pattern Recognit., Washington, DC, Nov. 1973, pp. 139-148.
[15] K. Fukunaga and R. D. Short, "A class of feature extraction criteria and its relation to the Bayes risk estimate," IEEE Trans. Inform. Theory, vol. IT-26, no. 1, pp. 59-65, 1980.
[16] K. Fukunaga and S. Ando, "The optimum nonlinear features for a scatter criterion in discriminant analysis," IEEE Trans. Inform. Theory, vol. IT-23, no. 4, pp. 453-459, 1977.
[17] S. R. Searle, Matrix Algebra Useful for Statistics. New York: Wiley, 1982.

Index Terms:
pattern recognition; feedforward neural nets; feature extraction; Bayes methods; discriminatory power; adaptive feedforward layered networks; discriminant analysis; linear-output networks; mean-square classifiers; pattern classification; mean-square error minimisation; familiar discriminant criterion; final hidden layer; feature extraction criterion; target coding scheme
H. Osman, M.M. Fahmy, "On the Discriminatory Power of Adaptive Feed-Forward Layered Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 16, no. 8, pp. 837-842, Aug. 1994, doi:10.1109/34.308481