next up previous contents
Next: 6 Recombination, Mutation and Up: 3 Genetic Programming Previous: 4 Initialisation   Contents

5 Fitness and Selection

Genetic programming requires an evaluation, or fitness, function to determine the quality of an individual. The fitness value is typically a single scalar value that allows selection methods to distinguish between various levels of individual quality. Multiobjective methods often use a vector to represent fitness. Most of the computation time, particularly for more complex problem domains, is spent in the fitness evaluation. For instance, in a robotic control problem, each individual might represent an algorithm to move a robot from one location to another. Each individual must then be run for the purpose of fitness evaluation in a simulator or on an actual robot, where both tasks may take considerable time.

To determine a fitness value, the individual is typically applied to a fitness case, or set of fitness cases. The score on the fitness case(s) is represented in the final fitness function value. For example, in the above robotic control problem example, executing a program that controls a robot in a simulator would represent a fitness case. The fitness value assigned to the robot on this fitness case could represent the distance from some predefined location and the final location of the robot. If this process was repeated several times by starting the robot in different initial locations, the values returned by these fitness cases could be averaged to represent the fitness value assigned to the individual. In the problem domains described later in this chapter, a fitness value will sometimes be the result of applying a program to one fitness case and sometimes it will be the result of applying the program to a set of fitness cases.

The design of a good representation and fitness function is critical. Ideally, the representation and fitness should have the property of strong causality where small changes in the genotype represent small changes in fitness [Rosca and Ballard, 1995]. However, weak causality, the opposite of strong causality, is more common in genetic programming. Weak causality implies more ``instability'' of the programs evolved by genetic programming, where small changes to solutions often yield large changes in fitness.

While fitness functions are designed to distinguish between individual quality, code growth and other dynamics can also be controlled with components added to the fitness function [Luke and Panait, 2002b]. Dynamic fitness functions and hierarchically defined fitness functions [Langdon and Poli, 2002] are examples of other ways to guide the search.

To create the next population of individuals in a generational algorithm, individuals are probabilistically selected from the current population based on fitness. The most common method in genetic programming is tournament selection. Tournament selection chooses $t$ individuals from the current population and returns the one with the best fitness. Typically, one tournament is held for each individual participating in crossover, mutation or reproduction. Other forms of selection consist of truncation selection, fitness-proportionate selection, ranked selection and selection based on multiple criteria (for example, selecting individuals based on fitness and then on size). See [Blickle and Thiele, 1995] for a discussion of selection methods.


next up previous contents
Next: 6 Recombination, Mutation and Up: 3 Genetic Programming Previous: 4 Initialisation   Contents
S Gustafson 2004-05-20