Next: 6 Effects of Population Up: 5 Genetic Lineages and Previous: 7 Discussion of Sampling   Contents

8 Summary

This chapter has described how genetic lineages can serve as a simple measure of diversity in a representation based on natural selection. Lineages have many possible uses in genetic programming; here, a variant of selection was used to further describe the search process. The relationship between genetic lineage loss, diversity and the fitness landscape was described. A correlated landscape, where the application of an operator causes small changes to fitness, will encourage the quicker loss of genetic lineages and diversity. An uncorrelated landscape will lose genetic lineages and diversity at a slower rate.
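As a rough illustration of lineages as a diversity measure, the sketch below counts how many distinct lineages survive in a population. It assumes each individual carries a `lineage` identifier inherited from a member of the initial population; the dictionary layout and the function names are hypothetical and are not taken from this chapter.

```python
from collections import Counter

def lineage_counts(population):
    """Count how many individuals belong to each lineage identifier."""
    return Counter(ind["lineage"] for ind in population)

def lineage_diversity(population):
    """Number of distinct lineages still present in the population."""
    return len(lineage_counts(population))

# Hypothetical population: the "lineage" and "fitness" fields are assumptions.
population = [
    {"lineage": 0, "fitness": 3.0},
    {"lineage": 0, "fitness": 2.5},
    {"lineage": 2, "fitness": 4.1},
    {"lineage": 0, "fitness": 2.0},
]

print(lineage_diversity(population))
```

Tracking this count per generation would give one possible view of how quickly lineages are lost on a given landscape.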

Lineage selection was used to increase diversity by shifting the selection pressure from the most fit individuals to those that are both fit and diverse. This produced varying performance across three problem domains. The results were analysed in the light of previous research to conclude that, if genetic programming is viewed as performing a type of hill-climbing search, adding diversity can worsen fitness on some problems that clearly benefit from elitism in a hill-climbing environment. However, when deception is embedded into the problem, improving diversity may help avoid local optima (as in the Ant problem), or it may compound the deception by maintaining its presence (as in the Binomial-3 problem).

The last section of this chapter examined the sampling of unique tree shapes and unique behaviours in genetic programming. An enriched definition of behaviour provided more information about solutions than typical fitness functions. As the genetic programming algorithm must search both structure and content, it is important to understand issues such as deception and the effort the algorithm spends on searching different types of behaviours and structures. The behaviour sampling results showed sampling trends that help explain previous diversity research and suggest new ways to improve search.
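One way to picture the sampling of unique tree shapes is to strip the node labels from each program tree and count the distinct structures that remain. The sketch below assumes trees are nested tuples of the form `(label, child, child, ...)`; this representation and the helper names are illustrative assumptions, not the encoding used in the experiments.

```python
def shape(tree):
    """Return the structure of a tree with all node labels removed.

    A tree is assumed to be a tuple (label, child, child, ...);
    a leaf is a one-element tuple (label,).
    """
    _label, *children = tree
    return tuple(shape(child) for child in children)

def count_unique_shapes(trees):
    """Number of distinct tree shapes among the sampled trees."""
    return len({shape(t) for t in trees})

# Hypothetical sample: the first two trees differ in content but share a shape.
trees = [
    ("+", ("x",), ("1",)),
    ("*", ("y",), ("2",)),
    ("+", ("x",), ("*", ("x",), ("x",))),
]

print(count_unique_shapes(trees))
```

Grouping the same count by tree size would show how many shapes are sampled at each size.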

The sampling results also showed that, in all problems, there are different behaviours with the same fitness value that genetic programming samples at much higher rates. If low phenotype diversity and entropy are likely indicators of deceptive regions of the search space, and considering the results from Chapter 4 showing a correlation between high values of these measures and better fitness, then better search is achieved when deceptive regions are avoided. Thus, adaptive measures that recognise the signs of deception could help to improve search. Lastly, the structure sampling results showed that, while bloat and code growth occur, fewer unique tree shapes are sampled at these large sizes. Problems that require specific structures at these large sizes are likely to be more difficult for genetic programming.
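For readers who want a concrete handle on the entropy measure mentioned here, the following is a minimal sketch that assumes entropy is computed as the Shannon entropy of the proportions of the population sharing each fitness value. The exact partitioning of fitness values used in Chapter 4 may differ, and the function name is hypothetical.

```python
import math
from collections import Counter

def fitness_entropy(fitnesses):
    """Shannon entropy (natural log) of the fitness-value distribution.

    Assumption: each distinct fitness value forms its own partition.
    """
    n = len(fitnesses)
    counts = Counter(fitnesses)
    return sum(-(c / n) * math.log(c / n) for c in counts.values())

# A population where every individual has the same fitness has zero entropy;
# a population spread evenly over many fitness values has high entropy.
converged = [1.0, 1.0, 1.0, 1.0]
spread = [1.0, 2.0, 3.0, 4.0]
print(fitness_entropy(converged), fitness_entropy(spread))
```

Under this assumption, a sustained drop in the value could serve as one of the signs that an adaptive measure monitors.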

In the next chapter, the relationship between diversity and other aspects of the search process is examined. In particular, the effects of population diversity are examined with respect to code growth and problem difficulty.


S Gustafson 2004-05-20