Laggel

Reputation: 1386

Why do Genetic Algorithms evaluate the same genome several times?

I'm trying to solve the DARP problem using a Genetic Algorithm DLL.

The thing is that even though it sometimes comes up with the right solution, other times it does not, even though I'm using a really simple version of the problem. When I checked the genomes the algorithm evaluated, I found out that it evaluates the same genome several times.

Why does it evaluate the same genome several times? Wouldn't it be more efficient if it did not?

Upvotes: 0

Views: 317

Answers (5)

user571138

Reputation:

Elitism aside, the GA does not evaluate the same genome repeatedly. What you are seeing is identical genomes being regenerated and re-evaluated. This is because each generation is a new set of genomes, which may or may not have been evaluated before.

To avoid the re-evaluation you would need to keep a list of already-produced genomes along with their fitness. To access the fitness you would need to compare each member of your new population against the list; when a genome is not in the list you would evaluate it and add it to the list.

As real-world applications have thousands of parameters, you end up with millions of stored genomes. This becomes massively expensive to search and maintain, so it is probably quicker to just evaluate the genome each time.
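For illustration only, here is a minimal sketch of that bookkeeping in Python, assuming a genome is a sequence of genes and fitness_fn stands in for your fitness function (both names are placeholders, not part of any particular GA library). Note that the dictionary grows with every distinct genome ever evaluated, which is exactly the storage and maintenance cost described above.

    # Sketch only: genomes are assumed to be sequences of genes, and fitness_fn
    # is a placeholder for an expensive fitness function.
    seen = {}  # maps genome -> fitness for every genome evaluated so far

    def evaluate(genome, fitness_fn):
        key = tuple(genome)        # the genome must be turned into a hashable key
        if key not in seen:        # not seen before: evaluate once and remember it
            seen[key] = fitness_fn(genome)
        return seen[key]           # otherwise reuse the stored fitness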

Upvotes: 0

Novak

Reputation: 4779

There is a difference between evaluating the same chromosome twice, and using the same chromosome in a population (or different populations) more than once. The first can probably be usefully avoided; the second, maybe not.

Consider:

In some generation G1, you mate 0011 and 1100, cross them right down the middle, get lucky and fail to mutate any of the genes, and end up with 0000 and 1111. You evaluate them, stick them back into the population for the next generation, and continue the algorithm.

Then in some later generation G2, you mate 0111 and 1001 at the first index and (again, ignoring mutation) end up with 1111 and 0001. One of those has already been evaluated, so why evaluate it again? Especially if evaluating that function is very expensive, it may very well be better to keep a hash table (or some such) to store the results of previous evaluations, if you can afford the memory.
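As a concrete sketch of that scenario in Python (chromosomes modelled as plain bit strings; the crossover helper below is illustrative, not any particular library's API):

    # One-point crossover: swap the parents' tails at the given index.
    def crossover(a, b, point):
        return a[:point] + b[point:], b[:point] + a[point:]

    # Generation G1: 0011 x 1100 crossed right down the middle, no mutation.
    print(crossover("0011", "1100", 2))   # ('0000', '1111')

    # Generation G2: 0111 x 1001 crossed at the first index, no mutation.
    print(crossover("0111", "1001", 1))   # ('0001', '1111') -- 1111 reappears

A plain dictionary keyed on the chromosome string would let that second 1111 reuse the fitness already computed in G1 instead of paying for the evaluation again.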

But! Just because you've already generated a value for a chromosome doesn't mean you shouldn't include it naturally in the results going forward, allowing it to either mutate further or allowing it to mate with other members of the same population. If you don't do that, you're effectively punishing success, which is exactly the opposite of what you want to do. If a chromosome persists or reappears from generation to generation, that is a strong sign that it is a good solution, and a well-chosen mutation operator will act to drive it out if it is a local maximum instead of a global maximum.

Upvotes: 1

rybo103

Reputation: 335

Depending on the particulars of your GA, you may have the same genomes in successive populations. For example, elitism saves the best genome (or the n best genomes) from each population.

Reevaluating genomes is inefficient from a computational standpoint. The easiest way to avoid this is to add a boolean HasFitness flag to each genome. You could also create a unique string key for each genome encoding and store all the fitness values in a dictionary. This lookup can get expensive itself, so it is only recommended if your fitness function is costly enough to warrant the added expense of the lookup.
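A rough sketch of both approaches in Python, assuming genomes are lists of genes and fitness_fn stands in for your fitness function (all names here are placeholders, not part of any specific GA library):

    class Genome:
        def __init__(self, genes):
            self.genes = genes
            self.fitness = None
            self.has_fitness = False          # the HasFitness flag

        def evaluate(self, fitness_fn):
            if not self.has_fitness:          # evaluate at most once per genome object
                self.fitness = fitness_fn(self.genes)
                self.has_fitness = True
            return self.fitness

    # Alternative: a dictionary keyed by a unique string encoding of the genome,
    # shared across generations; only worth it if fitness_fn is expensive enough.
    fitness_by_key = {}

    def evaluate_with_lookup(genes, fitness_fn):
        key = "".join(str(g) for g in genes)
        if key not in fitness_by_key:
            fitness_by_key[key] = fitness_fn(genes)
        return fitness_by_key[key]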

Upvotes: 0

Larry OBrien

Reputation: 8606

The basic explanation for why a GA might evaluate the same individual is precisely because it is non-guided, so the recreation of a previously-seen (or simultaneously-seen) genome is something to be expected.

More usefully, your question could be interpreted as about two different things:

  1. A high cost associated with evaluating the fitness function, which could be mitigated by hashing previously evaluated genomes, trading off memory. This is conceivable, but I've never seen it: GAs usually search high-dimensional spaces, so you'd end up with a very sparse hash.

  2. A population in which many or all members have converged to a single pattern or a few patterns: at some point, the diversity of your population will tend towards 0. This is the expected outcome once the algorithm has converged on the best solution it has found. If this happens too early, with mediocre results, it indicates that you got stuck in a local optimum and lost diversity too early.

In this situation, study the output to determine which of the two scenarios has happened:

  1. You lose diversity because particularly-fit individuals win a very large percentage of the parental lotteries. Or,
  2. You don't gain diversity because the population over time is quite static.

In the former case, you have to maintain diversity. Ensure that less-fit individuals get more chances, perhaps by decreasing turnover or scaling the fitness function.

In the latter case, you have to increase diversity. Ensure that you inject more randomness into the population, perhaps by increasing mutation rate or increasing turnover.

(In both cases, of course, you ultimately do want diversity to decrease as the GA converges in the solution space. So you don't just want to "turn up all the dials" and devolve into a Monte Carlo search.)
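As a rough illustration (assuming chromosomes are hashable values such as bit strings; the interpretation in the comments follows the two scenarios above, and any thresholds are deliberately left to you), you could log a simple diversity measure each generation:

    # Fraction of distinct chromosomes in the population (1.0 = all unique).
    def diversity(population):
        return len(set(population)) / len(population)

    # Logged every generation:
    #  - starts high and collapses within a few generations -> scenario 1:
    #    very fit parents dominate selection; scale fitness or reduce turnover.
    #  - low from the start and barely moves -> scenario 2:
    #    the population is static; raise the mutation rate or increase turnover.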

Upvotes: 1

Choufler

Reputation: 460

Basically, a genetic algorithm consists of:

  • an initial population (of size N)
  • a fitness function
  • a mutation operation
  • a crossover operation (usually performed on 2 individuals by taking parts of their genomes and combining them into a new individual)

At every step it:

  1. chooses random individuals
  2. performs crossover, resulting in new individuals
  3. possibly performs mutation (changes a random gene in a random individual)
  4. evaluates all old and new individuals with the fitness function
  5. chooses the N best-fitted individuals to be the new population for the next iteration

The algorithm stops when it reaches a threshold of the fitness function, or if there are no changes in the population over the last K iterations.
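Put together, a compact sketch of that loop in Python; the fitness, crossover, and mutate functions (and the 0.1 mutation probability) are placeholders you would supply, not a specific library's API:

    import random

    def run_ga(population, fitness, crossover, mutate, n, max_iters, target):
        for _ in range(max_iters):
            # 1-3: pick random parents, cross them over, possibly mutate the children
            a, b = random.sample(population, 2)
            children = list(crossover(a, b))
            children = [mutate(c) if random.random() < 0.1 else c for c in children]
            # 4: evaluate all old and new individuals with the fitness function
            candidates = population + children
            # 5: keep the N best-fitted individuals as the next population
            population = sorted(candidates, key=fitness, reverse=True)[:n]
            # stop once the best individual reaches the fitness threshold
            if fitness(population[0]) >= target:
                break
        return population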

So it could stop not at the best solution but at a local maximum.

Part of the population can stay unchanged from one iteration to the next because those individuals have a good fitness value. It is also possible to "fall back" to previous genomes because of mutation.

There are a lot of tricks to make a genetic algorithm work better: choosing an appropriate encoding of the population into genomes, finding a good fitness function, and playing with the crossover and mutation ratios.

Upvotes: 0
