
Les M. Howard, Donna J. D'Angelo, "The GAP: A Genetic Algorithm and Genetic Programming Hybrid," IEEE Intelligent Systems, vol. 10, no. 3, pp. 11-15, June 1995. doi:10.1109/64.393137
The GAP performs symbolic regression by combining the traditional genetic algorithm's function optimization strength with the genetic-programming paradigm to evolve complex mathematical expressions capable of handling numeric and symbolic data. This technique should provide new insights into poorly understood data relationships.
Discovering relationships has been a task troubling researchers since the dawn of modern science. Discovering relationships between sets of data is laborious and error prone, and it is highly subject to researcher bias. Because many of today's research problems are more complex than those of the past, it is increasingly important that robust data analysis methods be available to researchers. For a data analysis method to be most useful, it must meet at least three criteria: good predictive ability, insight into the inner workings of the system being analyzed, and unbiased results.
Historically, researchers deduced relationships solely by examining the data, a difficult task if the relationship is complex, if many variables are involved, or if the data are noisy (as often occurs in real-world problems). Moreover, the examination is easily influenced by the researcher's desires and expectations.
Statistical methods were among the first tools developed to help a researcher find the relationships among observed facts. These methods often rest on assumptions such as the following: (1) the data are normally distributed, (2) the equation relating the data is of a specific form (for example, linear, quadratic, or polynomial), and (3) the variables are independent. If the problem meets these assumptions, statistics are a valuable tool for providing static descriptors. But real-world problems seldom meet these criteria.
Neural networks, an artificial intelligence technique, are not limited by these assumptions. They serve as strong predictive models that can uncover complex relationships, but they give little insight into the underlying mechanisms that describe a relationship. However, two other nonstatistical AI techniques, genetic algorithms and genetic programming, are more robust methods of exploring complex solution spaces. Independently, they have had some success at revealing the mechanisms relating data items.
Recently, genetic algorithms, which use the principles of evolution through natural selection to solve problems, have established themselves as a powerful search and optimization technique. Most GAs are linear (the structure of an individual is a flat bit string). The basic GA proceeds as follows:
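A conventional bit-string GA of this kind can be sketched in a few lines of Python. This is a generic, textbook-style loop and not necessarily the exact procedure the authors have in mind: the tournament selection, one-point crossover, the parameter values, and the names `run_ga` and `one_max` are all assumptions made for illustration.

```python
import random

def run_ga(fitness, n_bits=16, pop_size=30, generations=40,
           crossover_rate=0.7, mutation_rate=0.01, seed=0):
    """Evolve flat bit strings toward higher fitness (illustrative sketch)."""
    rng = random.Random(seed)
    # 1. Create a random initial population of flat bit strings.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # 2. Score every individual in the current population.
        scores = [fitness(ind) for ind in pop]

        # 3. Select a parent by binary tournament (an assumed selection scheme).
        def pick():
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if scores[a] >= scores[b] else pop[b]

        next_pop = []
        while len(next_pop) < pop_size:
            p1, p2 = pick(), pick()
            # 4. Recombine the parents with one-point crossover.
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_bits)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # 5. Flip each bit with a small mutation probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        # 6. The offspring replace the old population; remember the best seen.
        pop = next_pop
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best = gen_best
    return best

def one_max(bits):
    """Toy fitness function: the number of 1 bits in the string."""
    return sum(bits)

best = run_ga(one_max)
print(best, one_max(best))
```

The toy fitness function simply counts 1 bits; in a real application the bit string would encode candidate parameters or rules, and the fitness function would measure how well they fit the data.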
GAs have been used for everything from multiple-fault diagnosis to medical-image registration. They have shown themselves to be a superior tool for developing rule-based systems, capable of gleaning knowledge from data inaccessible to statistical methods. Goldberg thoroughly discusses genetic algorithms and their use as a problem-solving and function optimization technique. Goldberg and Forrest give additional examples.
Although linear GAs are adept at developing rule-based systems, they cannot develop equations. A recent addition to the evolutionary domain is genetic programming, which uses an evolutionary approach to generate symbolic expressions and perform symbolic regressions. However, the genetic-programming method of performing symbolic regressions has some limitations. It can modify only the structure of an expression, not its contents, which are generated by the implementation program when the genetic programming starts. In performing symbolic regressions, genetic programming cannot deal with nonnumeric variables. It also tends to produce convoluted equations because it cannot modify the coefficients it uses (for example, a genetic program might use (2.523+2.523)/2.523 to represent the number 2).
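The coefficient example can be made concrete with a small expression-tree sketch. The nested-tuple representation and the helper names `evaluate` and `size` are assumptions chosen for illustration, not the authors' implementation; the point is only that a fixed constant forces a structural workaround, whereas an adjustable coefficient would need a single node.

```python
import operator

# Binary operators allowed in an expression tree (illustrative subset).
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node, env):
    """Evaluate a tree of (op, left, right) tuples, variable names, and numbers."""
    if isinstance(node, tuple):
        op, left, right = node
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):
        return env[node]   # variable lookup
    return node            # numeric constant

def size(node):
    """Count the nodes in an expression tree."""
    if isinstance(node, tuple):
        return 1 + size(node[1]) + size(node[2])
    return 1

C = 2.523  # a constant fixed when the run starts; structure-only search cannot alter it

# The value 2 expressed by rearranging structure around the fixed constant...
bloated_two = ("/", ("+", C, C), C)
# ...versus a single coefficient node whose value could be tuned directly.
tuned_two = 2.0

# The same model, y = 2 * x, written both ways.
bloated_expr = ("*", bloated_two, "x")
compact_expr = ("*", tuned_two, "x")

print(evaluate(bloated_expr, {"x": 3.0}), size(bloated_expr))
print(evaluate(compact_expr, {"x": 3.0}), size(compact_expr))
```

Both trees compute the same function, but the structure-only version spends five nodes on what is conceptually one coefficient, which is the kind of convoluted expression described above.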
We have developed a method combining the known strengths of traditional genetic algorithms with the new field of genetic programming to produce a superior tool for performing symbolic regressions. We call this tool the genetic algorithm-program, or the GAP.