Next: 2 Problem Difficulty
Up: 2 The Effects of
Previous: 2 The Effects of
  Contents
Code growth is a well-documented phenomenon of evolutionary systems
using a variable-length representation, as is used in
genetic programming. When an increase in solution size does not correspond with
fitness improvement then computation time is wasted and the
readability of solutions is decreased.
Several recent studies of the causes of code growth exist. Luke (2003)
presented a theory based on the depth of modification points
and Soule and Heckendorn (2002) assessed the viability
of three hypotheses: the protective, the drift and the removal
bias.
There are two major contributing factors to code growth in genetic
programming emphasised in these studies:
- The smaller the removed portion of a tree
the less likely that the fitness of the
resulting offspring will
be worse than the parent's according to the
removal bias [Soule and Heckendorn, 2002,Igel and Chellapilla, 1999].
- Child survivability is linked to deeper nodes selected
for modification in their parents
according to the depth-based theory of code growth [Luke, 2003].
This gives a bias toward larger parents which will tend to produce
larger children.
These two points are overlapping, and
are also somewhat consistent with other theories of code growth,
specifically Langdon and Poli's fitness causes bloat theory
[Langdon, 2000a,Langdon and Poli, 1998a,Langdon et al., 1999], which suggests that the
space of larger trees contains more better fit programs.
As fitness increases or stays the same with tree size growth,
the selection of crossover points
nearer to the leaves increases, the size of the removed portion
decreases and the modification points are deeper.
Thus, as trees grow, their modification points become deeper, their
removed subtrees become smaller and their children's survivability
is higher.
There is a clear bias toward a constant rate of growth during evolution.
However, empirical evidence also shows a wide range of
code growth rates that these theories do not predict.
In a series of random runs, genetic programming may produce a wide
range of solution
sizes.
For example, Figure 4.5 in Chapter 4 shows the average size of an
individual in each population for 50 random runs. In each of
the experiments, a wide range of sizes are produced.
Understanding the mechanics of the population that allowed
for smaller solutions and the opposite mechanics that caused larger
solutions seems paramount to controlling code growth while maintaining
fitness improvements. If solutions are predisposed toward
growth due to the recombination of tree types, controlling population
diversity could allow better control of
code growth. Also, understanding the effects and causes of different
levels of population diversity
could allow a more thorough description of the mechanisms that
cause growth.
Next: 2 Problem Difficulty
Up: 2 The Effects of
Previous: 2 The Effects of
  Contents
S Gustafson
2004-05-20