
6 Edit Distance Two (weighted)

Edit distance Two diversity is based on the depth-weighted edit distance between individuals used by Ekárt and Németh (2000). This measure (denoted ``ed 2'') is adapted back to its original formulation [Nienhuys-Cheng, 1997], in which the difference between any two non-equal nodes is 1. Using a value of $k=\frac{1}{2}$ gives differences near the root more weight than differences deeper in the trees.
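The depth weighting can be illustrated with a minimal sketch. It is not the implementation used in the experiments. It assumes that the two trees are overlaid from the root, that children are matched by position, and that a missing node counts as different from any node. The names `Node`, `node_diff` and `edit_distance` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A syntax-tree node: a symbol plus an ordered list of children."""
    symbol: str
    children: List["Node"] = field(default_factory=list)


def node_diff(p: Optional[Node], q: Optional[Node]) -> float:
    """Return 1 for two non-equal nodes and 0 for equal ones.

    Assumption: a missing node (None) differs from every real node.
    """
    if p is None or q is None:
        return 0.0 if p is q else 1.0
    return 0.0 if p.symbol == q.symbol else 1.0


def edit_distance(t1: Optional[Node], t2: Optional[Node], k: float = 0.5) -> float:
    """Depth-weighted distance: each level down is scaled by another factor k."""
    total = node_diff(t1, t2)
    c1 = t1.children if t1 is not None else []
    c2 = t2.children if t2 is not None else []
    # Overlay the child lists; pad the shorter one with missing nodes.
    for i in range(max(len(c1), len(c2))):
        a = c1[i] if i < len(c1) else None
        b = c2[i] if i < len(c2) else None
        total += k * edit_distance(a, b, k)
    return total


# (+ x y) compared with (- x (z 1))
t1 = Node("+", [Node("x"), Node("y")])
t2 = Node("-", [Node("x"), Node("z", [Node("1")])])
print(edit_distance(t1, t2, k=0.5))  # 1.75
print(edit_distance(t1, t2, k=1.0))  # 3.0
```

With $k=\frac{1}{2}$ the root mismatch contributes 1, the mismatch at depth one contributes $\frac{1}{2}$, and the extra node at depth two contributes $\frac{1}{4}$. With $k=1$ all three differences count equally.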

In [Wineberg and Oppacher, 2003], an $O(n+m)$ inter-population diversity method is developed based on pair-wise distance by counting the frequencies of symbols for each position in the genome. While a similar method could be found for genetic programming syntax trees, the variable length of the trees and the size of the symbol sets would make this calculation more complex. To reduce computation time here, population diversity is approximated by comparing each population member against a single tree: every individual in the population is compared with the best-fit individual found so far in the run. The resulting distances are summed, and the total is divided by the population size.
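The approximation described above can be sketched as follows. This is a minimal illustration in which the distance function is passed in as a parameter. The names `approx_diversity` and `hamming` are hypothetical, and the string distance is only a stand-in for a tree edit distance so that the example runs on its own.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def approx_diversity(
    population: Sequence[T],
    best_of_run: T,
    distance: Callable[[T, T], float],
) -> float:
    """Mean distance from each individual to the best-of-run individual.

    One distance evaluation per individual, rather than one per pair.
    """
    if not population:
        raise ValueError("population must not be empty")
    total = sum(distance(ind, best_of_run) for ind in population)
    return total / len(population)


def hamming(a: str, b: str) -> float:
    """Toy stand-in distance: mismatched positions plus the length difference."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return float(mismatches + abs(len(a) - len(b)))


population = ["abc", "abd", "xyz"]
print(approx_diversity(population, "abc", hamming))  # (0 + 1 + 3) / 3
```

For a population of $N$ individuals this needs $N$ distance evaluations, compared with $N(N-1)/2$ for a full pair-wise comparison.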

Both edit distance One and Two are vulnerable to outliers, especially when the best-fit individual is the outlier. However, previous experimental results display two key properties which make these measures appropriate and representative. First, even if the best fitness is found in the initial population, an individual in the current generation is considered to be the best of the run if it is at least as good as the current best-of-run individual. Second, with probabilistic selection based on fitness, the best individual is likely to contribute several offspring to the next generation and is unlikely to remain the outlier for long. Later chapters also consider the best individual in the current generation for these measures. Another reason for using the best-of-run individual in this chapter is that it is common for researchers to consider this individual during analysis rather than the best in the current generation.
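The first property amounts to a non-strict comparison when the best-of-run individual is updated. A minimal sketch, assuming that lower fitness values are better and that ties go to the most recent individual (the function name `update_best_of_run` and the tuple representation are hypothetical):

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def update_best_of_run(
    best: Optional[T],
    generation: Iterable[T],
    fitness: Callable[[T], float],
) -> Optional[T]:
    """Return the best-of-run individual after seeing one more generation.

    Assumption: lower fitness is better. The comparison is "<=", so an
    individual that merely ties the current best replaces it.
    """
    for individual in generation:
        if best is None or fitness(individual) <= fitness(best):
            best = individual
    return best


# Individuals are (label, fitness) pairs for illustration only.
initial_best = ("gen0", 1.0)
current_generation = [("gen5_a", 2.0), ("gen5_b", 1.0)]
new_best = update_best_of_run(initial_best, current_generation, lambda ind: ind[1])
print(new_best)  # ('gen5_b', 1.0)
```

Because a tie replaces the stored individual, the reference tree used for the diversity measure tends to come from a recent generation even when the best fitness value itself was found early in the run.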


S Gustafson 2004-05-20