next up previous contents
Next: 2 Empirical Analysis of Up: 1 Diversity Measures Previous: 1 Diversity Measures   Contents

1 Promoting Diversity

The canonical view of evolution and diversity is that more diversity will provide more opportunities for evolution. However, typical evolutionary algorithms contain a phase of exploration followed by exploitation [Eiben and Schippers, 1998]. Promoting all kinds of diversity during the entire search process could be counter-productive to the exploitation phase. The type and amount of diversity required at different times remains unclear. However, several measures and methods have been used to promote diversity. These methods typically use a non-standard selection, mating, or replacement strategy to increase or control diversity. Common methods include geographical distributions of individuals that limit their interactions, such as neighbourhoods [Collins, 1992] and islands [Martin et al., 2000]. Other methods consider behaviour and structure similarities, such as sharing [Goldberg and Richardson, 1987], crowding [DeJong, 1975] and genotype sharing [Deb and Goldberg, 1989]. These techniques were initially applied in genetic algorithms.

Eshelman and Schaffer (1993) investigated the advantage of pair-wise mating in genetic algorithms. The authors used Hamming distances to select individuals for recombination and replacement, finding improvement over hill-climbing-type selection strategies for genetic algorithms. Ryan's ``Pygmie'' algorithm (1994) addressed premature convergence and elitism in small populations for evolving minimal sorting networks. To facilitate selection for reproduction, the algorithm builds two lists: one based on fitness and the other on length. Ryan's algorithm maintained higher diversity and prevented premature convergence using simple measures. De Jong et al. (2001) used multiobjective optimisation for the n-Parity problem to promote diversity and concentrate on non-dominated individuals, according to a 3-tuple of $<$fitness, size, diversity$>$. Diversity was the average square distance to other members of the population, using a specialised measure of edit distance between nodes. This multiobjective method promoted smaller and more diverse trees.

Keller and Banzhaf (1995) described a structural difference measure based on edit distance between genotypes. The measure was more complicated than standard edit distance, but was intended for explicitly controlling the diversity of populations. Brameier and Banzhaf (2002) used a string edit distance on the effective portions of their Linear Genetic Programming individuals, measuring the distance between program code that contributed toward fitness. They used their measure in a two-level tournament, selecting for fitness and then for diversity.

McKay (2000) applied the fitness sharing concept from Deb and Goldberg (1989) to test its feasibility in genetic programming. Fitness sharing was credited with maintaining population diversity, allowing performance improvement and population size reduction for the Multiplexer and recursive list membership problems. Diversity was the number of fitness cases solved, where the sharing concept assigned a fitness based on an individual's performance divided by the number of other individuals with the same performance. Also, McKay studied negative correlation from [Liu et al., 2000] and a root Quartic negative correlation (2001a,2001b) to preserve diversity on the Multiplexer problem with mixed results. Similarly, a selection method that is uniform over the fitness values was suggested as an alternative way to preserve diversity [Hutter, 2002]. Ekárt and Németh (2000) applied fitness sharing with a novel tree distance definition to a symbolic regression problem, suggesting that it may be an efficient measure of structural diversity. Their results showed promise for controlling the size of programs, although not initially improving performance. The authors then applied their measure between every pair of individuals in a weighted arithmetic mean to develop a population diversity measure [Ekárt and Németh, 2002]. This measure was used to adaptively control diversity for explorative and exploitative search phases, noting the conflict between fitness improvement and high diversity observed in previous work. The authors found that on their symbolic regression instances, fitness sharing was able to improve accuracy and maintain population diversity.

Bersano-Begey (1997) tracked the number of individuals that solved specific fitness cases. The method promoted diversity and the discovery of different or less popular solutions. This was similar to the Stepwise Adaptation of Weights technique for constraint satisfaction and symbolic regression instances [Eiben and van Hemert, 1999,Eggermont and van Hemert, 2001]. Smith et al. (1993) investigated diversity within their immune system algorithm for classifier systems, based on a standard genetic algorithm. Their task was not concerned with traditional optimisation and required diverse populations to be successful. A speciation tree using Euclidean distance was applied by Bessaou et al. (2000) in a study on multimodal optimisation with island models. Their algorithm divided individuals into species, evolved them with a genetic algorithm and then redistributed them into new species. Geard and Wiles (2002) counted unique genotypes while studying recombination and diversity for a genetic algorithm solving the ``royal staircase'' problem.

Fernandes and Rosa (2001) looked at varying population sizes and non-random mating to maintain diversity for the Royal Road problem. Their negative assortative mating looks for genotypes with maximal Hamming distances. Darwen and Yao (2001) studied cooperation in the Iterated Prisoner's Dilemma problem and found that increasing behavioural diversity, not genetic diversity, can improve cooperation and performance. The authors also commented on the dogma surrounding diversity and some previous methods to maintain diversity [Darwen and Yao, 2000]. Ursem (2002) cited the importance of high and low diversity phases in an evolutionary strategy framework. The author used a ``distance-to-average-point'' diversity measure for the real-value encoded individuals. Depending on whether the diversity is in a predefined high or low diversity phase, different recombination operators were used, allowing diversity to fall or promoting more diversity.


next up previous contents
Next: 2 Empirical Analysis of Up: 1 Diversity Measures Previous: 1 Diversity Measures   Contents
S Gustafson 2004-05-20