
A. Sankar and R. J. Mammone, "Growing and Pruning Neural Tree Networks," IEEE Transactions on Computers, vol. 42, no. 3, pp. 291-299, Mar. 1993, doi: 10.1109/12.210172.
A pattern classification method called neural tree networks (NTNs) is presented. The NTN consists of neural networks connected in a tree architecture. The neural networks recursively partition the feature space into subregions, and each terminal subregion is assigned a class label that depends on the training data routed to it by the neural networks. The NTN is grown by a learning algorithm, as opposed to multilayer perceptrons (MLPs), whose architecture must be specified before learning can begin. A heuristic learning algorithm based on minimizing the L1 norm of the error is used to grow the NTN; this method is shown to minimize the number of classification errors better than the squared-error minimization used in backpropagation. An optimal pruning algorithm is given to enhance the generalization of the NTN. Simulation results are presented on Boolean function learning tasks and a speaker-independent vowel recognition task. The NTN compares favorably to both neural networks and decision trees.
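The growing procedure the abstract describes — a neural unit at each tree node splits the data, and recursion continues until a subregion is pure — can be illustrated with a toy sketch. This is not the authors' algorithm: the per-node unit below is a simple perceptron whose update steps by the sign of the error (an L1-flavored heuristic of our own), the axis-aligned fallback for degenerate splits is our safeguard, and the paper's optimal pruning step is omitted entirely.

```python
import random

random.seed(0)  # reproducible sketch

def _act(w, b, x):
    """Hard-threshold output of one linear unit."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0

def train_split(X, y, epochs=200, lr=0.1):
    """Train one neural unit for a tree node. The weight update moves
    by sign(error), loosely echoing the paper's L1-norm criterion
    (our simplification, not the authors' exact growing algorithm)."""
    w = [random.uniform(-1.0, 1.0) for _ in X[0]]
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - _act(w, b, xi)  # err is -1, 0, or +1
            if err:
                for j, xj in enumerate(xi):
                    w[j] += lr * err * xj
                b += lr * err
    return w, b

def grow(X, y, depth=0, max_depth=6):
    """Recursively partition the feature space, NTN-style."""
    if len(set(y)) == 1 or depth == max_depth:
        return ("leaf", max(set(y), key=y.count))  # majority label
    w, b = train_split(X, y)
    side = [_act(w, b, xi) for xi in X]
    if len(set(side)) == 1:
        # Degenerate neural split: every point routed one way. Fall back
        # to an axis-aligned cut (our safeguard; the paper differs).
        j = next(k for k in range(len(X[0]))
                 if len({xi[k] for xi in X}) > 1)
        t = (min(xi[j] for xi in X) + max(xi[j] for xi in X)) / 2.0
        w = [0.0] * len(X[0]); w[j] = 1.0; b = -t
        side = [_act(w, b, xi) for xi in X]
    L = [i for i, s in enumerate(side) if s == 0]
    R = [i for i, s in enumerate(side) if s == 1]
    return ("node", w, b,
            grow([X[i] for i in L], [y[i] for i in L], depth + 1, max_depth),
            grow([X[i] for i in R], [y[i] for i in R], depth + 1, max_depth))

def predict(tree, x):
    """Route x down the tree to a terminal subregion's label."""
    while tree[0] == "node":
        _, w, b, left, right = tree
        tree = right if _act(w, b, x) else left
    return tree[1]

# XOR, a Boolean function task of the kind used in the paper's simulations.
# A single perceptron cannot separate it, but a tree of them can.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]
tree = grow(X, y)
print([predict(tree, xi) for xi in X])  # matches y: [0, 1, 1, 0]
```

Because every split (neural or fallback) sends at least one point to each side, recursion terminates at pure leaves, so the training set is classified exactly; without the paper's pruning step, however, such a fully grown tree would be expected to generalize poorly on noisy data.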