next up previous contents
Next: 2 Problem Difficulty Up: 2 The Effects of Previous: 2 The Effects of   Contents

1 Code Growth

Code growth is a well-documented phenomenon of evolutionary systems using a variable-length representation, as is used in genetic programming. When an increase in solution size does not correspond with fitness improvement then computation time is wasted and the readability of solutions is decreased. Several recent studies of the causes of code growth exist. Luke (2003) presented a theory based on the depth of modification points and Soule and Heckendorn (2002) assessed the viability of three hypotheses: the protective, the drift and the removal bias. There are two major contributing factors to code growth in genetic programming emphasised in these studies:
  1. The smaller the removed portion of a tree the less likely that the fitness of the resulting offspring will be worse than the parent's according to the removal bias [Soule and Heckendorn, 2002,Igel and Chellapilla, 1999].
  2. Child survivability is linked to deeper nodes selected for modification in their parents according to the depth-based theory of code growth [Luke, 2003]. This gives a bias toward larger parents which will tend to produce larger children.
These two points are overlapping, and are also somewhat consistent with other theories of code growth, specifically Langdon and Poli's fitness causes bloat theory [Langdon, 2000a,Langdon and Poli, 1998a,Langdon et al., 1999], which suggests that the space of larger trees contains more better fit programs. As fitness increases or stays the same with tree size growth, the selection of crossover points nearer to the leaves increases, the size of the removed portion decreases and the modification points are deeper. Thus, as trees grow, their modification points become deeper, their removed subtrees become smaller and their children's survivability is higher. There is a clear bias toward a constant rate of growth during evolution.

However, empirical evidence also shows a wide range of code growth rates that these theories do not predict. In a series of random runs, genetic programming may produce a wide range of solution sizes. For example, Figure 4.5 in Chapter 4 shows the average size of an individual in each population for 50 random runs. In each of the experiments, a wide range of sizes are produced. Understanding the mechanics of the population that allowed for smaller solutions and the opposite mechanics that caused larger solutions seems paramount to controlling code growth while maintaining fitness improvements. If solutions are predisposed toward growth due to the recombination of tree types, controlling population diversity could allow better control of code growth. Also, understanding the effects and causes of different levels of population diversity could allow a more thorough description of the mechanisms that cause growth.


next up previous contents
Next: 2 Problem Difficulty Up: 2 The Effects of Previous: 2 The Effects of   Contents
S Gustafson 2004-05-20