
1 Correlations in Final Populations

Table 4.2 summarises the Spearman correlation coefficients between the best fitness in each of 100 runs and the diversity of the final populations. The correlations between the diversity measures themselves are also reported.
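As an illustration of how a coefficient of this kind can be computed, the following is a minimal sketch of the Spearman rank correlation using only the Python standard library. It is not the code used to produce Table 4.2; the function names and the sample data are hypothetical, and tied values are assumed to receive the average of their ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman coefficient: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: best fitness and a diversity value for five runs.
best_fitness = [12, 30, 7, 25, 18]
diversity = [40, 22, 45, 31, 35]
rho = spearman(best_fitness, diversity)
```

Because the coefficient depends only on ranks, any strictly increasing rescaling of either variable leaves it unchanged.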

Table 4.2: Spearman correlation coefficients for the Ant, Quartic, Rastrigin and Parity experiments.

The Ant Problem
          fitness   phenes    genes   p-isom  entropy     ed 1
phenes     -.3936        -        -        -        -        -
genes       .1962   -.4950        -        -        -        -
p-isom      .4009   -.6389    .6949        -        -        -
entropy    -.3615    .9039   -.5724   -.7569        -        -
ed 1        .4205   -.5040    .2991    .3998   -.4891        -
ed 2        .4606   -.4537    .4702    .5603   -.4949    .7504

The Quartic Problem
          fitness   phenes    genes   p-isom  entropy     ed 1
phenes      .4345        -        -        -        -        -
genes      -.1363   -.0353        -        -        -        -
p-isom     -.0300    .1588    .8408        -        -        -
entropy     .3924    .9730   -.1712    .0070        -        -
ed 1       -.1640    .0045    .2290    .3150   -.0191        -
ed 2       -.0881   -.0273    .1554    .2182   -.0461    .6891

The Rastrigin Problem
          fitness   phenes    genes   p-isom  entropy     ed 1
phenes     -.0616        -        -        -        -        -
genes      -.1305    .7089        -        -        -        -
p-isom     -.2262    .5521    .6163        -        -        -
entropy    -.0402    .9688    .7324    .5525        -        -
ed 1       -.0530   -.0365    .2056    .2014    .0460        -
ed 2       -.0762    .1185    .3265    .3828    .1750    .6514

The Parity Problem
          fitness   phenes    genes   p-isom  entropy     ed 1
phenes     -.7803        -        -        -        -        -
genes      -.0641    .0510        -        -        -        -
p-isom      .0773    .0646    .5132        -        -        -
entropy    -.7146    .7048   -.0379    .0204        -        -
ed 1        .3235   -.2156    .1178    .4483   -.3062        -
ed 2        .0148   -.0087    .2656    .5377   -.0626    .7265

In the Ant experiments, fitness is negatively correlated with both phenotype diversity and entropy: good (low) fitness is seen with high phenotype diversity and high entropy. Fitness is positively correlated with the edit distance measures and with pseudo-isomorphs, which suggests that low (good) fitness is seen with low diversity in these structural measures. Only a very weak correlation is seen between genotypes and fitness in the Ant experiments, which is the trend for all the experiments. From Figures 4.9 and 4.10, edit distance generally decreases during the run. Runs tend to converge structurally in the Ant experiments and, with respect to the ed 1 measure, in the Parity experiments; the runs that converge more tend to have better fitness. This may be the result of good fitness being found early in the run, or it may be the result of convergence leading to better fitness.

Table 4.2 also gives the results for the Rastrigin experiments, which show no strong correlation between diversity and fitness (the same effect is partially seen in the Quartic experiments). Several explanations are possible: a correlation between fitness and diversity may have existed earlier in the run, the relationship may not be linear, or the final populations may have lost any correlation through the repeated application of selection and recombination without change in fitness.

The importance of phenotype diversity is now seen with the Parity experiments in Table 4.2, where a strong negative correlation exists between fitness and phenotype diversity. Figure 4.7 shows that phenotype diversity tends to increase in the Parity experiments. With only 32 possible fitness values in the problem, the population begins with random guesses near a fitness of 16. As populations undergo selection and recombination, the number of unique fitness values increases from 3-4 to 6-13. Without some increase in phenotype diversity, genetic programming cannot distinguish between good individuals and bad ones.
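The two fitness-based measures discussed here can be sketched as follows. This is an illustrative sketch rather than the thesis implementation: phenotype diversity is taken to be the number of unique fitness values in the population, as described above, while the entropy is assumed to be the Shannon entropy (natural logarithm) over the proportions of the population sharing each fitness value. The function names and sample populations are hypothetical.

```python
from collections import Counter
from math import log


def phenotype_diversity(fitnesses):
    """Number of unique fitness values in the population."""
    return len(set(fitnesses))


def fitness_entropy(fitnesses):
    """Shannon entropy of the fitness distribution (assumed definition).

    Each unique fitness value forms one partition; p is the fraction of
    the population in that partition.
    """
    n = len(fitnesses)
    counts = Counter(fitnesses)
    return -sum((c / n) * log(c / n) for c in counts.values())


# Hypothetical Parity-like populations: an early one clustered near 16,
# and a later one spread over more fitness values.
early = [16, 16, 16, 15, 17, 16, 16, 15]
later = [16, 14, 12, 15, 10, 13, 16, 11]

early_phenes = phenotype_diversity(early)
later_phenes = phenotype_diversity(later)
```

Under these definitions the entropy is zero when every individual has the same fitness and is largest when the population is spread evenly over its unique fitness values, which is consistent with the high correlation between phenotype diversity and entropy in Table 4.2.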

Because tournament selection uses the fitness values of individuals to decide tournaments, fewer unique phenotypes in the population (and lower entropy) make selection more random. That is, selection is faced with many individuals that have the same fitness. Therefore, if high phenotype diversity and entropy are maintained, selection pressure remains at the pre-set level. Lower phenotype diversity and entropy might actually benefit problems where less selection pressure is suitable, but negatively affect others where higher selection pressure is better.
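One way to make this effect concrete is to compute the probability that every competitor in a tournament has the same fitness, in which case the winner can only be chosen arbitrarily. The sketch below assumes competitors are sampled uniformly with replacement; the function name, the tournament size and the sample populations are hypothetical and are not taken from the experiments.

```python
from collections import Counter


def prob_all_tied(fitnesses, tournament_size):
    """Probability that all sampled competitors share one fitness value.

    Assumes competitors are drawn uniformly with replacement, so the
    probability is the sum over fitness values of p ** tournament_size,
    where p is the fraction of the population with that value.
    """
    n = len(fitnesses)
    counts = Counter(fitnesses)
    return sum((c / n) ** tournament_size for c in counts.values())


# Hypothetical populations of eight individuals.
low_diversity = [16, 16, 16, 16, 16, 16, 15, 17]   # 3 unique fitness values
high_diversity = [16, 14, 12, 15, 10, 13, 9, 11]   # 8 unique fitness values

tied_low = prob_all_tied(low_diversity, tournament_size=3)
tied_high = prob_all_tied(high_diversity, tournament_size=3)
```

With these sample populations, the low-diversity case yields a much larger share of fully tied tournaments than the high-diversity case, which is the sense in which selection becomes more random.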

Table 4.2 also gives the correlation between measures of diversity. In the Ant experiments, note that phenotype diversity correlates negatively with the structural measures (genotypes, pseudo-isomorphs, and the edit distances). An increase (or decrease) in the number of unique fitness values in the population corresponds to a decrease (or increase) in structural diversity. This seems counter-intuitive, as more unique genotypes should correspond to more unique fitness values. The behaviour is expected for the edit distance measures, as these generally decrease during evolution while phenotype diversity increases. In this problem, the discovery of different fitness values appears to be aided by less structural diversity. That is, if the population is structurally similar, it is easier to find more unique fitness values. Possible hypotheses for this behaviour include a better environment for crossover, less deception in the search space, or a more focused local search phase.

Figure 4.13: Correlation, in evolving populations, between the best fitness in each population and the different diversity measures. Each point represents the correlation across the 100 populations from 100 runs at a single generation; each of the 50 generations is represented.
[Figure 4.13 plots, including the Ant and Rastrigin panels, are not reproduced here.]


S Gustafson 2004-05-20