
2 Correlation Measures

An objective of this study is to quantify the importance and ideal levels of diversity, as recorded by different measures on common problems. The primary test of the relationship between diversity and fitness is the Spearman correlation measure [Siegel, 1956]. The Spearman measure ranks two sets of variables and tests for a linear relationship between the variables' ranks. Correlation is examined first to determine whether diversity can distinguish the better of two runs. Because interesting relationships may exist without being linear, a range of scatter plots of diversity measures against fitness is also evaluated.
Figure 4.3: Examples of ranked correlation scatter plots between fitness and diversity, where low fitness (ideal) is ranked from 1 to 50 (with 50 runs total) and diversity is ranked from high (1) to low (50). The middle graph shows the case of no correlation where the points are aligned vertically or horizontally.
\begin{figure}\centerline{
\psfig{figure=chapters/ch4figs/correlation-example.eps,width=15.0cm}}\end{figure}

The Spearman correlation coefficient is computed as follows:

\begin{displaymath}
1 - \frac{6 \sum_{i=1}^{N}d_{i}^2}{N^3-N} ,
\end{displaymath}

where $N$ is the number of runs and $d_i$ is the difference between the fitness rank and the diversity rank of run $i$. A value of -1.0 represents perfect negative correlation, 0.0 denotes no correlation and 1.0 represents perfect positive correlation. For the measures used here, best-of-generation fitness values are ranked in ascending order (1=best,$\ldots$,50=worst) and diversity is also ranked in ascending order (1=lowest diversity and 50=highest diversity). When low (good) fitness values occur with high diversity, the correlation coefficient should therefore be strongly negative. Conversely, a positive correlation indicates that bad fitness accompanies high diversity and good fitness accompanies low diversity. Figure 4.3 plots fitness rank on the X-axis against diversity rank on the Y-axis and shows the type of correlation that each scatter plot pattern indicates.
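
The coefficient above can be illustrated with a minimal Python sketch. The function names `ranks` and `spearman` and the sample values are hypothetical and are not taken from the study. The sketch assumes that no two runs share the same fitness or diversity value, since the formula is exact only when there are no tied ranks.

```python
def ranks(values):
    """Rank values in ascending order (1 = smallest).

    Assumption: values contains no ties.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(fitness, diversity):
    """Compute 1 - 6 * sum(d_i^2) / (N^3 - N) over N runs."""
    n = len(fitness)
    if n != len(diversity) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 runs")
    fitness_ranks = ranks(fitness)      # 1 = best (lowest) fitness
    diversity_ranks = ranks(diversity)  # 1 = lowest diversity
    # d_i is the difference between the two ranks of run i
    sum_d_squared = sum((f - d) ** 2
                        for f, d in zip(fitness_ranks, diversity_ranks))
    return 1.0 - 6.0 * sum_d_squared / (n ** 3 - n)


# Hypothetical data: five runs where the best (lowest) fitness
# coincides with the highest diversity.
fitness = [0.10, 0.25, 0.40, 0.55, 0.70]
diversity = [9.0, 7.5, 6.0, 4.5, 3.0]
print(spearman(fitness, diversity))  # -1.0
```

In this example the rank differences are -4, -2, 0, 2 and 4, so the sum of their squares is 40 and the coefficient is 1 - 240/120 = -1.0. This is the strongly negative case described above, where good fitness occurs with high diversity.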


S Gustafson 2004-05-20