Next: 1 Population Measures: Entropy
Up: 6 Effects of Population
Previous: 1 Code Growth and
  Contents
Building on the study by Daida et al. (2001),
the Binomial-3 problem with varying ERC ranges is used to increase
difficulty. As in Chapter 5, the Binomial-3 problem consists of approximating the
function
using the functions
and the terminals
.
are ERCs in the ranges of
[-1,1) (referred to lated as unity), [-10,10) (ten),
and [-100,100) (`hundred'), where larger ranges create increasing
difficulty.
The Binomial-3 function is shown graphically over
the interval [-1,0] in Figure 6.1 (top left). There are
several solutions to this function which
can be represented easily by many different combinations of functions
and terminals.
The easiest version of
this problem would be expected to allow for many close-to-optimal
initial solutions
and more new optimal solutions to be acquired
in every generation. Harder instances are likely to cause more
difficulty in finding good solutions early and often.
To complement this problem, a
method is used to generate random polynomial instances
[Ekárt and Németh, 2002].
Random polynomials are generated up to degree 11 with
non-zero coefficients and with real roots in the range
.
As higher degree polynomials have a larger number of roots close to zero, most of the values they take in the
interval are close to zero.
For genetic programming, it is easy to find relatively close fit solutions in
the form of straight lines close to the X axis. However, it is difficult to
improve on these solutions to find a
really good approximation, especially if the relatively fit and simple
individuals
spread across the genetic programming population after a few generations.
The polynomials of increasing degree with non-zero coefficients and real roots
in the
interval constitute a class of increasing difficulty
problem instances which are suitable for studying the behaviour of
genetic programming, as they have the following properties:
- Minimum Required Depth.
Higher degree polynomials can be approximated less easily with
our function set. Each term of degree
is formed by a sequence
of
. This, in general,
will require larger solutions for a good approximation of a higher degree
polynomial.
A polynomial with real roots and non-zero coefficients
can be
expressed as a tree by using
functions (i.e.,
multiplications and
subtractions) and
terminals.
The minimum depth (thus complete) tree representing this polynomial
is of depth
.
- Maximum Allowed Depth.
It follows from the above requirement that
polynomials with higher degree are harder for genetic programming
to approximate if allowing the same limited depth for trees, as there is less
space for genetic programming to grow introns. Introns refer to
non-functional code in the program. In these cases the trees must
be more efficient in using more nodes to express the solution.
- Approximation Ability.
Lastly, the Matlab Polyfit routine is used to assign
a degree of approximation ability to the polynomials. Polyfit
finds the best polynomial of given degree to fit
a set of points.
The polynomials of degrees 11, 7 and 3 (shown in Figure 6.1) were best fitted
as polynomials of degrees 11, 8 and 3, respectively, with
errors of order
, the approximation error being the highest for
the 11 degree polynomial. That is, there were no close approximate polynomials
of degree
to generated polynomials of degree
with
.
Furthermore, as the generated polynomials are best fit with Polyfit polynomials
of similar degree, a successful application of genetic programming on this
problem should
also find polynomials of similar degree.
The type of increased difficulty posed by the random polynomials and
the Binomial-3 function is similar. The Binomial-3 problem must
cope with ERC values spanning an increasing range.
Likewise, for the random polynomials, when the degree increases,
the fixed ERC range will become less appropriate as the polynomials
contain variations of smaller magnitude.
For assessing the differences in behaviour between easy and difficult
instances, two measures of diversity are used.
The entropy measure is used for indicating the distribution of
individuals over fitness values (and subsequently the level of selection
pressure) and the edit distance measure is used for indicating
the structural diversity of a population.
These measures are described next.
Subsections
Next: 1 Population Measures: Entropy
Up: 6 Effects of Population
Previous: 1 Code Growth and
  Contents
S Gustafson
2004-05-20