next up previous contents
Next: 4 Biological Foundations of Up: 1 Previous Distributed Evolution Previous: 2 Genetic Programming Distributed   Contents

3 Speciation, Niching and Other Methods

The previous five models all used the concept of demes, while the following two rely more heavily on the notion of niches and do not use migration. Kishore et al. (2001) applied genetic programming to an $n$ class classification problem with a ``niching" technique and a parallel implementation. Niching allowed the populations to evolve independently of each other on different problems and in parallel. Bongard (1999) studied the effects of evolving two populations that attempt to solve a regression problem with two different Fourier functions. Improved solutions were found when populations sent individuals at the end of a run to a server. New runs were then seeded with both random individuals and individuals that were previous solutions to either problem.

Other methods have been used to allow populations to occupy separate areas of the search space and to promote diversity. Some of these methods were also discussed briefly in Chapter 4. Crowding [DeJong, 1975] in genetic algorithms requires new offspring to replace the most similar individual in a pool drawn from the population. Sharing, phenotype and genotype [Goldberg and Richardson, 1987,Deb and Goldberg, 1989], require individuals in a genetic algorithm to receive a lower fitness, determined by the size of the population that has the same fitness or are genetically similar. A mating restriction scheme was also investigated [Deb and Goldberg, 1989] to prevent dissimilar individuals from the likely destructive crossover operation in genetic algorithms. McKay (2000) applied the fitness sharing method to genetic programming and found increased diversity, error rate reductions and similar performance with smaller populations as benefits.

Sharing was also applied to genetic programming using an edit distance between trees [Ekárt and Németh, 2000]. Although solution quality was not improved, sharing did find similar solutions of smaller size. Structure fitness sharing [Hu et al., 2002] was applied to genetic programming to encourage the parameter search (functions and terminals) over similar structures (trees) while not requiring expensive edit distance measures. The method adjusts an individual's fitness according to how many other individuals have the same tree structure. Another method used to simulate niching or speciation is negative correlation [Liu et al., 2000]. In the context of learning ensembles of neural networks, ``negative correlation learning provides a novel way to decompose the learning task of the ensemble into a number of subtasks for different individual networks''. The resulting populations were then divided into species, where a representative from each species was included in the ensemble. McKay and Abbass (2001a, 2001b) investigated negative correlation for genetic programming on the Multiplexer problem as a way to improve diversity and prevent premature convergence.

Edmonds (2001) applied genetic programming on the sun spot prediction problem by allowing individuals to find a problem niche. By creating a number of ``models" (here model is used to specify an area of the problem to solve, or niche) and selecting individuals for fitness evaluation based on their neighborhood and proximity to the niche, Edmonds hoped to allow individuals to orient themselves near those niches which are most effective. Li et al. (2002) implemented a ``species conserving'' genetic algorithm and tested it on several multimodal and deceptive problems. Species were found in the panmictic population by a distance metric. Species ``seeds'' were individuals that were at least as fit or more fit than the rest of the species. These individuals were then ``conserved'', or copied, into the next generation.

Lastly, a range of operators have been proposed to preserve the context or similarity of exchanged genetic material [D'haeseleer, 1994,Poli and Langdon, 1998a,Langdon and Poli, 2002,Page et al., 1999,Poli and Langdon, 1998b,Platel et al., 2003]. Like distributed models, these methods attempt to improve the ability of genetic programming to sample good regions of the search space effectively. While distributed models hope to allow semi-isolated subpopulations to better sample the search space prior to convergence, context preserving and homologous recombination operators attempt to sample the search space more effectively with a better designed operator. Both similar and dissimilar mating has been shown to improve search, often producing smaller in size solutions with similar fitness [Ekárt and Németh, 2000,Ryan, 1994]. Chapter 5 results, produced using lineage selection, also showed how mating (likely) dissimilar individuals can lead to smaller solutions with some loss of solution quality. Also, while specialised operators may seem intuitively better, they are often computationally expensive and the results unclear.


next up previous contents
Next: 4 Biological Foundations of Up: 1 Previous Distributed Evolution Previous: 2 Genetic Programming Distributed   Contents
S Gustafson 2004-05-20