next up previous contents
Next: 6 Summary Up: 5 Discussion of a Previous: 2 A Model of   Contents

3 Recommendations

The relationship between size and fitness is complex. Previous research has investigated the external dependencies with nodes and the node-to-node interactions in regression problems. Shifting nodes between programs is likely to result in semantic changes during crossover. An increase in program size may help to buffer against detrimental semantic changes. However, where size is not a necessary part of the fitness function, an increase in size results in a computational cost which becomes an ever-increasing burden on simulation time. Additionally, the many attempts to reduce size that result in weaker fitness illustrate the negative consequences of directly pressuring size.

Given that genetic programming is likely to favour larger trees, several methods have been used in an to attempt to limit code growth. The standard method to deal with code growth is to place a limit on the size of individuals [Koza, 1992,Luke and Panait, 2002a,Poli, 2003]. This measure has been shown to be difficult to beat with respect to overall algorithm performance. However, ideal sizes for particular problems are generally unknown, and placing a limit on size can have unforeseen consequences as trees will find it harder to replicate within size limits. Also, when genetic programming is applied on more complex problems it is not clear how to scale the limits of size appropriately.

Parsimony pressure is a common method used to control size [Burke et al., 1998,Iba et al., 1994,Luke and Panait, 2002a,Luke and Panait, 2002b,Rosca, 1997a]. This method favours fitter and smaller individuals. The degradation in performance due to parsimony pressure is studied by Soule and Foster [Soule and Foster, 1998]. They noted that pressure to reduce size increases the failure of runs, where populations often converge to very small, poorly fit individuals. Several variations of such methods exist which vary the relative importance of size and fitness. Their drawback is that they require a large parameter search for optimality [Luke and Panait, 2002a]. In [Zhang and Mühlenbein, 1995], an adaptive pressure allowed an initial increase in size before increasing the parsimony pressure. This allowed the method to find better fitness but required problem specific tuning of parameters. Other methods add a size objective to the fitness objective, creating a multiobjective problem [de Jong et al., 2001,Ekárt and Németh, 2001]. Smaller size individuals and similar fitness with smaller population sizes are reported benefits of these methods. Using specific recombination operators and methods which combine like-sized individuals [Langdon, 2000b] or requiring an improvement in fitness in children [O'Reilly and Oppacher, 1995] are other approaches to reducing code growth. These methods attempt to reduce the amount of ineffective code. Generally, without a large parameter search or problem specific modifications, these methods are likely to reduce code growth but also worsen fitness.

However, high rates of growth do not seem to be necessary for solving more difficult instances, as many small good solutions are still found. Rather, an increased rate of growth appears to be the effect of higher selection pressure and lower genetic diversity. Methods which remove large individuals or pressure the population away from increased growth directly are likely to have negative effects. In these cases, good fitness is likely occurring in the neighbourhood of large individuals, and by biasing the search away from them, overall performance could be worsened. Therefore, the following is recommended as a method to reduce the rate of growth and the overall size of solutions.

Methods which work to reduce the rate of code growth more gradually or indirectly, such as the adaptive control of selection pressure and population diversity during recombination, may be a more effective strategy against code growth.


next up previous contents
Next: 6 Summary Up: 5 Discussion of a Previous: 2 A Model of   Contents
S Gustafson 2004-05-20