
Vincent W. Porto, David B. Fogel, Lawrence J. Fogel, "Alternative Neural Network Training Methods," IEEE Intelligent Systems, vol. 10, no. 3, pp. 16-22, June 1995.
This article investigates three potential neural network training algorithms in processing active sonar returns. Our tests show that although all three methods can generate reasonable probabilities of detection and false alarm in discriminating between man-made objects and background events, the stochastic training methods of simulated annealing and evolutionary programming can outperform back propagation.
Neural networks are parallel structures made up of nonlinear processing nodes that are connected by fixed or variable weights. They can provide arbitrarily complex measurable decision mappings and are well suited for use in pattern recognition. Neural network paradigms generally fall into two categories: unsupervised and supervised learning.
Unsupervised learning networks use the neural network topology as a self-organizing clusterer where decision regions are created with respect to the similarity of the input exemplars to other previously observed exemplars. The network adapts its outputs to minimize a function of the spacing between the elements in each organized cluster of data points. This paradigm requires no a priori clustering information.
Supervised learning networks operate on input exemplars that are associated with a desired output. Such learning involves iteratively training the neural network to approximate this mapping. System designers often use multilayer perceptrons (see Figure 1) in supervised learning applications because they can perform arbitrarily complex decision mapping. While this article restricts itself to supervised learning networks, the stochastic training methods investigated also apply to other topologies and unsupervised algorithms.
Multilayer perceptrons allow for continuous-valued inputs and outputs. The internal layers transform the input vectors into the output space. Internal weights and bias terms define the output of the network given the presented exemplar. This mapping of input exemplars on the network topology defines a response surface over an n-dimensional hyperspace where there are n weight and bias terms to be adapted. Various algorithms can be used to search for the set of weights and biases that minimize selected functions of the error between the actual and desired outputs (such as the mean squared error).
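To make the idea of a response surface concrete, the following minimal sketch evaluates the mean squared error of a small multilayer perceptron as a function of its weights and biases. The single hidden layer, the sigmoid nodes, the network size, and all names (`forward`, `mean_squared_error`, `count_parameters`) are illustrative assumptions and are not taken from the article.

```python
import math

def sigmoid(x):
    """Logistic nonlinearity applied at each processing node (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, params):
    """Map one input exemplar to the network output (one hidden layer)."""
    hidden_w, hidden_b, out_w, out_b = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(hidden_w, hidden_b)]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

def mean_squared_error(params, exemplars):
    """Average squared difference between actual and desired outputs."""
    return sum((forward(x, params) - target) ** 2
               for x, target in exemplars) / len(exemplars)

def count_parameters(params):
    """Number n of weight and bias terms, i.e. the dimension of the search space."""
    hidden_w, hidden_b, out_w, _ = params
    return sum(len(row) for row in hidden_w) + len(hidden_b) + len(out_w) + 1

# Hypothetical 2-input, 2-hidden-node, 1-output network with every term at zero.
params = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [0.0, 0.0], 0.0)
# Made-up exemplars: (input vector, desired output).
exemplars = [([0.0, 1.0], 1.0), ([1.0, 1.0], 0.0)]

print(count_parameters(params))               # dimension n of the hyperspace
print(mean_squared_error(params, exemplars))  # height of the surface at this point
```

Each choice of `params` is one point in the n-dimensional space, and `mean_squared_error` gives the height of the response surface at that point. A training algorithm is then any procedure that searches this space for a low point.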
Typical response surfaces often possess local minima. Optimization techniques based on gradient descent may stagnate at these potentially suboptimal solutions, rendering the network incapable of sufficient performance. Second-order Newton and quasi-Newton methods may also fall prey to such entrapment.
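The entrapment described above can be illustrated on a one-dimensional stand-in for a response surface. The function `loss` below is a hypothetical example chosen only because it has one local and one global minimum; it is not an error surface from the article, and the step size and iteration count are arbitrary assumptions.

```python
def loss(w):
    """Hypothetical 1-D error surface with a local and a global minimum."""
    return w**4 - 3 * w**2 + w

def gradient(w):
    """Analytic derivative of loss."""
    return 4 * w**3 - 6 * w + 1

def gradient_descent(start, learning_rate=0.01, steps=1000):
    """Plain gradient descent: repeatedly step against the gradient."""
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w

# The two runs differ only in their starting weight.
w_right = gradient_descent(2.0)
w_left = gradient_descent(-2.0)

print(w_right, loss(w_right))
print(w_left, loss(w_left))
```

Started at 2.0, the search settles at approximately w = 1.13, a local minimum. Started at -2.0, it reaches the lower minimum at approximately w = -1.30. The gradient is essentially zero at both points, so the first run has no way of detecting that a better solution exists elsewhere.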
Many tricks have been invented for avoiding this problem, such as restarting with a new random set of weights, training with noisy exemplars, and perturbing the weights when they appear to converge prematurely. While these methods may lead to improved solutions, there is no guarantee that the solutions they reach will not also be only locally optimal. Further, the same suboptimal solution may be rediscovered, leading to fruitless oscillatory training behavior.
Stochastic techniques offer an alternative to conventional gradient methods. Both simulated annealing and simulated evolution can serve to generate weight and bias sets. Neither suffers from entrapment in local minima (that is, they have asymptotic global, as opposed to only local, convergence properties).
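As a rough illustration of how a stochastic search can escape a local minimum, here is a minimal simulated annealing sketch on a hypothetical one-dimensional surface. The `loss` function, the Gaussian proposal, the geometric cooling schedule, and every parameter value are assumptions made for this example and do not describe the article's implementation. Asymptotic convergence guarantees are usually stated for much slower cooling than the geometric schedule used here.

```python
import math
import random

def loss(w):
    """Hypothetical 1-D error surface with a local and a global minimum."""
    return w**4 - 3 * w**2 + w

def simulated_annealing(loss, start, steps=5000, t0=2.0, cooling=0.999,
                        step_size=0.5, seed=0):
    """Metropolis-style search: always accept improvements, and accept
    worse candidates with a probability that shrinks as the temperature falls."""
    rng = random.Random(seed)
    current, current_loss = start, loss(start)
    best, best_loss = current, current_loss
    temperature = t0
    for _ in range(steps):
        candidate = current + rng.gauss(0.0, step_size)  # random perturbation
        candidate_loss = loss(candidate)
        delta = candidate_loss - current_loss
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_loss = candidate, candidate_loss
            if current_loss < best_loss:
                best, best_loss = current, current_loss
        temperature *= cooling  # geometric cooling (an assumed schedule)
    return best, best_loss

# Start in the basin of the local minimum.
best_w, best_loss = simulated_annealing(loss, start=2.0)
print(best_w, best_loss)
```

Occasionally accepting uphill moves is what lets the search cross the barrier between basins. A pure descent method started from the same point would never leave the basin it began in.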
This article assesses three methods of training neural networks: (1) back propagation, (2) simulated annealing, and (3) evolutionary programming. We compare their performance in distinguishing between sonar returns that reverberate from a man-made metal sphere and those caused by sea mounts, fish and plant life, background noise, or similar anomalies. Although we focus on sonar signal processing, the results extend to more general applications of pattern classification.
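Evolutionary programming, the third method listed, can be sketched as a population of candidate weight vectors that are mutated and then compete for survival. The version below is a simplified illustration under stated assumptions: the fixed mutation standard deviation, the tournament size, the tie-breaking rule, and the toy `loss` function are all hypothetical choices and are not the configuration used in the article.

```python
import random

def loss(weights):
    """Hypothetical multimodal error over a weight vector."""
    return sum(w**4 - 3 * w**2 + w for w in weights)

def evolutionary_programming(loss, n_weights, pop_size=20, generations=50,
                             sigma=0.3, opponents=5, seed=0):
    """Each parent yields one offspring by Gaussian mutation; parents and
    offspring then compete in a stochastic tournament for survival."""
    rng = random.Random(seed)
    population = [[rng.uniform(-2.0, 2.0) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Mutation only, no recombination.
        offspring = [[w + rng.gauss(0.0, sigma) for w in parent]
                     for parent in population]
        combined = population + offspring
        scores = [loss(individual) for individual in combined]
        # Each individual earns a win for every randomly drawn rival it matches or beats.
        wins = []
        for score in scores:
            rivals = [rng.choice(scores) for _ in range(opponents)]
            wins.append(sum(score <= rival for rival in rivals))
        # Keep the individuals with the most wins; ties are broken by lower loss.
        ranked = sorted(range(len(combined)), key=lambda i: (-wins[i], scores[i]))
        population = [combined[i] for i in ranked[:pop_size]]
    best = min(population, key=loss)
    return best, loss(best)

initial_w, initial_loss = evolutionary_programming(loss, 3, generations=0)
final_w, final_loss = evolutionary_programming(loss, 3, generations=50)
print(initial_loss, final_loss)
```

Because the best individual wins every contest it enters and ties are broken by loss, it always survives to the next generation, so the best loss in the population never increases. Replacing `loss` with a network's mean squared error over its weights and biases would turn the same loop into a training procedure.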