Next: Methods for representing cost Up: Visualisation of Hierarchical Cost Previous: Visualisation of Hierarchical Cost

Introduction

A variety of techniques exist for studying the structure and properties of cost surfaces (fitness landscapes). Such approaches can provide insights for selecting appropriate search processes. The No Free Lunch theorem asserts the equality of all search processes when averaged over all cost surfaces [1]. However, it is arguable that not all cost surfaces are equally likely, so that some optimisation processes will be more effective than others in practice [2]. For efficient/effective optimisation, an algorithm (or its parameters) should be tailored to the features of the cost surface.

For complex families of cost surfaces, it is often a non-trivial problem to find a matching algorithm. One approach compares the empirical performance of a variety of algorithms on a suite of test functions [3]. For this approach to be informative about novel cost surfaces, there must be some principle suggesting appropriate similarities between the benchmark problem and the new problem. Knowing which metric to use to judge the similarity of two cost surfaces requires an understanding of why an algorithm performs well on a particular problem.

One barrier to understanding why an algorithm is effective is in characterising features of the cost surface. For large (benchmark) problems, where it may be too computationally expensive to compute the entire surface, one approach is to describe the surface with a set of estimated statistics, such as the number of local optima, and their relative distance. Such explorations of the cost surface have been used for describing NK landscapes [4].

An alternative approach to understanding cost surfaces, applicable to smaller benchmark problems, is the use of visualisation techniques. Visualising cost surfaces for problems of more than a few dimensions is a difficult task for the human perceptual system. Unfortunately, as is well known, surfaces in 2- or 3-D, which are most intuitive for humans, can be misleading with respect to properties of higher dimensional spaces. Ideally a benchmark problem should be small enough to be efficiently computed, while remaining complex enough to provide insights into scaled-up versions.

A wide variety of techniques exist for visualising multivariate data (see [5] for a classic guide). These include dimension-reduction techniques such as PCA and multi-variate scaling (e.g., [6]), graph visualisation (e.g., [7]), and animation (e.g., [8]). Specialist techniques have also been developed in many domains. However, many techniques for visualising high-dimensional data are less well suited to visualisation of high-dimensional surfaces.

For different domains, the types of displays that are effective depend on the problems that are addressed and the simplifications that can be made without loss of information. This paper considers cost surfaces that have been proposed in the literature as being amenable to search by evolutionary computation, particularly using the crossover operator.

One of the early tenets of evolutionary computation was the ``building block hypothesis'' [9]. This hypothesis proposes that a characteristic of many real-world problems is that solutions may be constructed from progressively higher-level combinations of building blocks. These blocks must be easy to identify (once discovered) and easily recombined into a wide variety of structures. Holland [10] reasons that, ``once a computer scientist starts thinking about building blocks as a source of innovation, the next obvious step is to look for algorithms that can discover and exploit building blocks'' (p374).

Holland argues that many problems exhibit a hierarchically structured cost surface, and argues that optimisation algorithms should behave accordingly, claiming that genetic algorithms do so because of the crossover operator of reproduction which allows the recombination of disparate building blocks. The performance of genetic algorithms on such hierarchically structured cost surfaces has been studied using a variety of test functions including the Royal Road problems [11,12]. The original Royal Road problem was proposed to study recombination using crossover, but in the search for clarity, it had only one fitness peak. It was discovered that hill-climbing strategies worked very well, better in fact than crossover (see [13] for a good summary of the reasons).

Goldberg, Deb and Korb [14] argued that demonstrating the power of crossover requires the use of deceptive functions in which the fitness landscape includes local optima that interfere with hill-climbing techniques. More recently, Watson and Pollack [15] reviewed the tasks that have been used to study hierarchical structure across a variety of studies from the literature. They concluded that many of the tasks used to investigate such problems do not have the requisite structure to fully investigate the power of evolutionary algorithms using crossover. Their own example, H-IFF (hierarchical if-and-only-if; [16]) has the interesting property of symmetry around diametrically opposed fitness peaks, with many suboptimal peaks and consequently many local optima. The H-IFF function is designed to exhibit hierarchical modularity, particularly recursive modularity (i.e., the same relationship holds between layers of the hierarchy).

This paper considers a visualisation technique for cost surfaces that are of particular relevance to genetic algorithms. The technique is applicable to cost-surfaces that are defined over moderate-dimensional binary spaces ({0, 1}n for n <= 16) and which are hierarchically structured. This visualisation tool exploits the fact that symmetric patterns in low dimensions often indicate recursive structure in higher dimensions. Although humans do not readily visualise high-dimensional structures, we do readily perceive symmetries.

The features of these functions that we would most like to understand concern

  • the ruggedness of each surface, based on the number and distribution of local optima
  • the size and shape of basins of attraction around each local optimum
  • the length of random adaptive walks.
It is possible to use statistical techniques to estimate the properties of such features for some problems (see [4] for application to NK problems), but they require detailed mathematical insight and lack the efficacy of direct visualisation.

The visualisation technique is aimed at providing a lucid description of a cost surface, allowing researchers to understand how the properties of the cost surface interact with the properties of the evolutionary strategy (or other optimisation technique). The visualisation approach is described in Section II, and in Section III we perform a case-study analysis of the H-IFF function [16] using the visualisation tool.


Next: Methods for representing cost Up: Visualisation of Hierarchical Cost Previous: Visualisation of Hierarchical Cost