An Analysis of Diversity in Genetic Programming

Steven Matt Gustafson

February 2004

University of Nottingham

School of Computer Science & IT


Genetic programming is a metaheuristic search method that uses a population of variable-length computer programs and a search strategy based on biological evolution. The idea of automatic programming has long been a goal of artificial intelligence, and genetic programming presents an intuitive method for automatically evolving programs. However, this method is not without some potential drawbacks. Search using procedural representations can be complex and inefficient. In addition, variable sized solutions can become unnecessarily large and difficult to interpret.

The goal of this thesis is to understand the dynamics of genetic programming that encourages efficient and effective search. Toward this goal, the research focuses on an important property of genetic programming search: the population. The population is related to many key aspects of the genetic programming algorithm. In this programme of research, diversity is used to describe and analyse populations and their effect on search. A series of empirical investigations are carried out to better understand the genetic programming algorithm.

The research begins by studying the relationship between diversity and search. The effect of increased population diversity and a metaphor of search are then examined. This is followed by an investigation into the phenomenon of increased solution size and problem difficulty. The research concludes by examining the role of diverse individuals, particularly the ability of diverse individuals to affect the search process and ways of improving the genetic programming algorithm.

This thesis makes the following contributions: (1) An analysis shows the complexity of the issues of diversity and the relationship between diversity and fitness, (2) The genetic programming search process is characterised by using the concept of genetic lineages and the sampling of structures and behaviours, (3) A causal model of the varied rates of solution size increase is presented, (4) A new, tunable problem demonstrates the contribution of different population members during search, and (5) An island model is proposed to improve the search by speciating dissimilar individuals into better-suited environments.

Currently, genetic programming is applied to a wide range of problems under many varied contexts. From artificial intelligence to operations research, the results presented in this thesis will benefit population-based search methods, methods based on the concepts of evolution and search methods using variable-length representations.


S. Gustafson. 2004. An Analysis of Diversity in Genetic Programming. Ph.D. Dissertation, School of Computer Science and Information Technology, University of Nottingham, Nottingham, U.K

Other Formats:


© S Gustafson 2004-05-20