
1 A survey and analysis of diversity in genetic programming demonstrated the complexity behind the issue of diversity measures and methods and the relationship between diversity and fitness.

Several measures of diversity frequently used in the literature were surveyed and analysed in Chapter 4. Specifically, the most typical measures from the literature referring to genotype and phenotype traits were used in an experimental study. The general behaviour of these measures showed how difficult it is to attribute run failure to diversity loss, because many measures behaved unexpectedly. For example, the measure of unique genotypes typically reached high levels after only a few generations and then neither increased nor decreased significantly. As individuals grew larger, maintaining distinct individuals was generally not difficult.
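As an illustration of the unique-genotype measure, the following minimal sketch counts structurally distinct individuals in a population. The nested-tuple tree encoding and the function name `unique_genotype_ratio` are assumptions made for this example and are not taken from the thesis.

```python
def unique_genotype_ratio(population):
    """Return the fraction of structurally distinct individuals.

    Assumption: each individual is a hashable nested tuple such as
    ("+", "x", ("*", "x", "1")), so structural equality is tuple equality.
    """
    if not population:
        return 0.0
    return len(set(population)) / len(population)


# Hypothetical population of three program trees, two of them identical.
population = [
    ("+", "x", "y"),
    ("+", "x", "y"),
    ("*", "x", ("+", "x", "1")),
]
ratio = unique_genotype_ratio(population)
```

With larger trees, even a small structural change yields a new genotype, which is consistent with this measure staying high throughout a run.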

Using a measure based on edit distance, Chapter 4 showed that populations generally lose most of their diversity early in the run and then remain at low levels. There was no distinct phase transition of diversity loss that could be attributed to an expected time when the run would become stuck in local optima. Rather, low edit distance diversity is the result of selection, the representation and the operator. In some cases, the increased loss of diversity was linked to improved search performance, demonstrated by the correlation between good fitness and low diversity.
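To make the idea of an edit-distance diversity measure concrete, the sketch below overlays two trees from the root and counts mismatched nodes. It is a simplified stand-in, not necessarily the exact measure used in Chapter 4: the nested-tuple encoding, the unit cost per node, and the choice to average over all pairs of individuals are all assumptions of this example.

```python
from itertools import combinations


def label(tree):
    """Node label: first element of a tuple, or the leaf itself."""
    return tree[0] if isinstance(tree, tuple) else tree


def children(tree):
    """Child subtrees; leaves have none."""
    return tree[1:] if isinstance(tree, tuple) else ()


def size(tree):
    """Number of nodes in the tree."""
    return 1 + sum(size(c) for c in children(tree))


def overlay_distance(a, b):
    """Overlay two trees at the root and count differing nodes.

    Assumed costs: 1 for each pair of aligned nodes with different
    labels, plus the full size of any subtree with no counterpart.
    """
    d = 0 if label(a) == label(b) else 1
    ca, cb = children(a), children(b)
    for x, y in zip(ca, cb):
        d += overlay_distance(x, y)
    longer = ca if len(ca) > len(cb) else cb
    for extra in longer[min(len(ca), len(cb)):]:
        d += size(extra)
    return d


def mean_pairwise_distance(population):
    """Average overlay distance over all pairs (an assumed aggregate)."""
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    return sum(overlay_distance(a, b) for a, b in pairs) / len(pairs)


a = ("+", "x", "y")
b = ("+", "x", ("*", "y", "1"))
diversity = mean_pairwise_distance([a, a, b])
```

A population that converges on near-identical structures drives this average toward zero, which matches the low-diversity regime described above.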

Measures of particular importance were those based on fitness values. A high number of unique phenotypes and high entropy were both correlated with the best fitness found during the evolutionary process. Problems with discrete fitness spaces in particular, such as the Ant and Parity problems, cause selection methods to become more random when the population loses unique fitness values. The loss of unique fitness values can also increase the chance that selection will pick deceptive or conflicting individuals to search with. This explains why high fitness-based diversity measures correlated well with good fitness.
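The two fitness-based measures can be sketched as follows. This is a minimal illustration that assumes entropy is computed over the proportions of the population sharing each distinct fitness value; the function names are hypothetical.

```python
import math
from collections import Counter


def unique_fitness_count(fitnesses):
    """Number of distinct fitness values (unique phenotypes) present."""
    return len(set(fitnesses))


def fitness_entropy(fitnesses):
    """Shannon entropy of the fitness distribution.

    Assumption: p_k is the fraction of the population with fitness
    value k, and entropy is -sum(p_k * log(p_k)) with the natural log.
    """
    n = len(fitnesses)
    if n == 0:
        return 0.0
    counts = Counter(fitnesses)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical discrete fitness values, as in Ant- or Parity-style problems.
converged = [12, 12, 12, 12]
spread = [10, 11, 12, 13]
```

When every individual shares one fitness value, as in `converged`, the entropy is zero and fitness-based selection cannot distinguish between individuals, which is the situation in which selection becomes more random.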

Previous methods used to control diversity, like those used to control code growth, are often heavy-handed. Such methods are likely to affect the run in many unexpected ways. Therefore, in Chapter 5, a measure of diversity that incorporated ancestry information and inheritance was used.


S Gustafson 2004-05-20