8.5 Schemata and the Schema Theorem
The notion of schemata and the Schema theorem provide the theoretical
background why genetic algorithms work so well in practice. The
notion of schemata is fundamental to the following analysis. After
introducing it the Schema theorem will be deduced.
The theoretical background of genetic algorithms is seen most clearly
using the binary string representation and the classic algorithm (cf.
Section 8.4.2). First the wildcard
shall match any single component of a vector. A schema is built of
the alphabet of genes (the fixed components) augmented by the wildcard
symbol [49,77], e.g.,
matches
and if the alphabet is . Hence a schema
represents all strings of the search space which match the schema at
all components except .
The following analysis is performed for the classic genetic
algorithm, whose inner loop is the following.
- .
- Select the new population from the old one by
single string selection based on a positive fitness function.
- Recombine by one point crossover and mutation.
- Evaluate .
- Repeat.
First we give some definitions. Let denote the defining
length of a schema , i.e., the distance between the first and the
last fixed component. It measures how compact the information
contained in a schema is. For example
. Let denote
the order of the schema , i.e., the number of fixed components. For
example
. Let
be the fitness of the schema at time , i.e., the
average fitness of all strings in the population that are matched
by . If matches the individuals
,
then
, where
is the fitness of a single individual . Finally is the
total fitness of the whole population at time , i.e.,
. is the size of the population at time .
In the single string selection step of the classic genetic algorithm
considered, an individual has the probability
of being selected. After the selection step it is expected that
strings match the schema , since matches
individuals, the number of single string selections is , and the
probability that an average string matched by is selected in a
single string selection is
.
Introducing the average fitness of the population
, this can be written as
The interpretation of this formula is that the number of individuals
matching a schema grows in each time step like the ratio of the
fitness of and the total fitness. Thus an above average schema,
i.e., a schema with
, will proliferate
successfully in the next generation, but a below average schema will
not. In the long term the formula implies that above average schemata
will proliferate exponentially.
The next step after selection is recombination, i.e., crossover and
mutation, where new individuals are introduced into the population.
The effects of crossover are considered first. A crossover site is
selected uniformly among possible sites, where is the length
of the vector representing an individual. Hence
is the probability that the schema is destroyed if a certain
vector (or chromosome) undergoes crossover, and thus
holds for the probability of its survival, where is the
probability of crossover. is greater equal, and not equal,
to the right hand side, since it may happen that a schema is conserved
after splitting it in middle if both ancestors happen to contain parts
of the schema at the right places.
Thus taking into account selection and crossover, the number of
individuals matching a schema can be estimated by
The final operator to be considered is mutation. Let be the
probability that a single component is changed. Since mutations are
independent from one another, the probability that a schema
survives a mutation is
Furthermore
, since .
Hence the final estimation of
is
which takes selection, crossover, and mutation in the classic genetic
algorithm into account.
The Schema theorem can thus be stated as follows
[40,77,13].
Theorem 8..1 (Schema Theorem)
The inequality
holds with the above notation. This means that short, low order,
above average schemata receive exponentially increasing trials in
subsequent generations of the classic genetic algorithm and below
average schemata receive exponentially decreasing trials.
Finally it is noted that this inequality was derived for the classic
genetic algorithm described at the beginning of this section and
assuming that the fitness function is positive. A positive function
can, however, always be achieved using a suitable transformation.
This inequality gives insight how genetic algorithms work and why they
can sample big search spaces very fast. Above average schemata
receive exponentially increasing attention and they are combined by
crossover. Furthermore mutation increases the variability of the
population, but the disruptive effective of the crossover and mutation
operators is not significant for short schemata.
Clemens Heitzinger
2003-05-08