Proportional Selection (Roulette-Wheel) in Genetic Algorithms

Some of you know that I'm maintaining PGApack, a genetic algorithm package, and a corresponding Python wrapper, PGApy. Recently the question came up why PGApack, when using Fitness Proportionate Selection (also known as roulette-wheel or proportional selection), errors out because it cannot normalize the fitness values.

In PGApack, the user supplies a real-valued evaluation function (which is called for each individual) and specifies whether the result of that evaluation should be minimized or maximized (the optimization direction).

When using fitness proportionate selection, this evaluation value must be mapped to a positive value that assigns each individual a share of the roulette wheel. If the optimization direction is minimization, small values are better and need a larger share of the wheel so that they have a higher probability of being selected. So we need to remap the raw evaluation values to a nonnegative fitness that grows as solutions get better. For a minimization problem we compute the maximum (worst) evaluation; the fitness is then the difference between this maximum and the evaluation of the individual (after scaling the maximum a little so that no fitness value is exactly zero):

\begin{equation*} F = E_{max} - E \end{equation*}

where \(F\) is the fitness of the current individual, \(E_{max}\) is the maximum of all evaluations in this generation, and \(E\) is the evaluation of that individual.
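The remapping can be sketched in a few lines of Python (illustrative only, not the PGApack implementation; the scaling factor 1e-9 is an arbitrary choice for this example):

```python
def fitness_minimize(evals):
    """Map evaluations (smaller is better) to nonnegative fitness F = E_max - E."""
    e_max = max(evals)
    # Scale the maximum slightly so the worst individual
    # does not end up with fitness exactly zero.
    eps = abs(e_max) * 1e-9
    e_max += eps if eps > 0 else 1e-9
    return [e_max - e for e in evals]

# The best (smallest) evaluation gets the largest share of the wheel:
print(fitness_minimize([3.0, 1.0, 2.0]))
```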

Now when evaluation values differ by several orders of magnitude, it can happen that the difference in that formula ends up being \(E_{max}\) for many (different) evaluation values. I'm calling this an overflow in the error message, which is probably not the best name for it.

That overflow happens when \(E_{max}\) is large compared to the current evaluation value \(E\) so that the difference ends up being \(E_{max}\) (i.e. the subtraction of \(E\) had no effect). In the code we check for this condition:

if (cmax == cmax - evalue && evalue != 0)

This condition triggers when subtracting the evaluation \(E\) from the current \(E_{max}\) does not change \(E_{max}\) even though \(E\) is not zero. So \(E\) is so small compared to \(E_{max}\) that the double data type cannot represent the difference. This happens whenever the unit in the last place (called ulp by Goldberg, not the same Goldberg who wrote the Genetic Algorithms book afaik) of the significand (also called mantissa) is more than twice the value being subtracted: with round-to-nearest, anything below half an ulp of \(E_{max}\) cannot change the result [1].

In our example \(E_{max} = 1.077688 * 10^{22}\) and the evaluation where this failed was \(E = 10000\). The IEEE 754 double precision floating point format has 53 bits of significand, which can represent integers exactly up to \(2^{53} = 9007199254740992\) or about \(9 * 10^{15}\). The spacing between adjacent doubles (the ulp) around our \(E_{max}\) (which lies between \(2^{73}\) and \(2^{74}\)) is therefore \(2^{73-52} = 2^{21} \approx 2.1 * 10^6\), so the number 10000 is well below half an ulp and the subtraction cannot change \(E_{max}\). We can try this in Python (which uses double for its float type):

>>> 1.077688e22 - 10000 == 1.077688e22
True
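The same effect can be made explicit with math.ulp (available since Python 3.9), which returns the spacing of doubles around a value; here with the numbers from the example above:

```python
import math

e_max = 1.077688e22
print(math.ulp(e_max))         # 2097152.0, the gap between adjacent doubles here
print(e_max - 1e4 == e_max)    # True: 10000 is below half an ulp, it vanishes
print(e_max - 2e6 == e_max)    # False: values near a full ulp do change the result
```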

Why do we make this check in the program? Letting the search continue with such an overflow (or whatever we want to call it) would map many different evaluation values to the same fitness. So the genetic algorithm could not distinguish these individuals.
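A tiny demonstration of that collapse, using the \(E_{max}\) from above plus a few made-up evaluations:

```python
# With an E_max this large, several distinct evaluations all map to
# the same fitness value (the subtraction has no effect for any of them).
e_max = 1.077688e22
evals = [10000.0, 20000.0, 500000.0]
fitnesses = {e_max - e for e in evals}
print(len(fitnesses))  # 1: three different individuals, one single fitness
```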

So what can we do about it when that error happens?

The short answer: Use a different selection scheme. There is a reason why the default in PGApack is not proportional selection.

Fitness proportionate selection (aka roulette-wheel selection) has other problems, too. It exerts too much selection pressure at the beginning of the search and too little at the end (this is also mentioned in the Wikipedia article, but beware, I've written parts of it).

Blickle and Thiele [2] did a mathematical analysis of different selection schemes and showed that proportional selection is typically not a good idea (it was historically the first selection scheme and is described in Goldberg's (the other Goldberg) book [3], which is probably the reason it is still being used). Note that there is an earlier report by Blickle and Thiele [4] that is more frank about the use of proportional selection: "All the undesired properties together led us to the conclusion that proportional selection is a very unsuited selection scheme. Informally one can say that the only advantage of proportional selection is that it is so difficult to prove the disadvantages" ([4], p. 42). They were not as outspoken in the final paper :-)

We're seeing this in the example above: the differences between good and bad evaluations are very large (in fact so large that the fitness cannot even be computed, see above). So when using proportional selection, the very good individuals are selected with too high a probability, resulting in premature convergence.
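A small numeric sketch (made-up evaluation values) shows how lopsided the wheel becomes for a minimization problem with one outstanding individual:

```python
# Roulette-wheel shares with fitness F = E_max - E (minimization).
evals = [1.0, 900.0, 950.0, 1000.0]
e_max = max(evals) * 1.001           # slight scaling, worst is not exactly zero
fit = [e_max - e for e in evals]
probs = [f / sum(fit) for f in fit]
print(probs)  # the single good individual gets roughly 87% of the wheel
```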

That all said: if you're doing optimization with genes of type 'real' (represented by double values in PGApack), you may want to try Differential Evolution [5], [6], [7]. At least in my experiments with antenna optimization [8] the results outperform the standard genetic algorithm, and similar results are reported by several practitioners [7]. Examples of how to use this are in examples/deb/optimize.c and examples/mgh/testprogde.c in PGApack.
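For readers who have not seen Differential Evolution, here is a minimal, self-contained sketch of the classic DE/rand/1/bin scheme (not the PGApack implementation; the parameter names F and CR follow the DE literature, and the sphere function is just a toy objective):

```python
import random

def de_step(pop, f_obj, F=0.8, CR=0.9):
    """One generation of DE/rand/1/bin, minimizing f_obj. Illustrative only."""
    dim = len(pop[0])
    new_pop = []
    for i, x in enumerate(pop):
        # Pick three distinct individuals, all different from x.
        a, b, c = random.sample([p for j, p in enumerate(pop) if j != i], 3)
        j_rand = random.randrange(dim)   # at least one gene comes from the mutant
        trial = [a[j] + F * (b[j] - c[j])
                 if (random.random() < CR or j == j_rand) else x[j]
                 for j in range(dim)]
        # Greedy selection: keep the better of parent and trial.
        new_pop.append(trial if f_obj(trial) <= f_obj(x) else x)
    return new_pop

# Usage: minimize the sphere function in 3 dimensions.
random.seed(42)
sphere = lambda v: sum(t * t for t in v)
pop = [[random.uniform(-5, 5) for _ in range(3)] for _ in range(20)]
for _ in range(100):
    pop = de_step(pop, sphere)
best = min(pop, key=sphere)
```

Note the greedy replacement step: unlike proportional selection, an individual is only ever replaced by a better one, which avoids the selection-pressure problems discussed above.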

The PGASetDECrossoverProb parameter is critical: for problems where the dimensions typically cannot be optimized separately, the value should be set close to, but not equal to, 1.