Reputation: 508
I have a really big (exponential) search space for a problem, and as such i've been attempting to use genetics/evolution to find the approximate best solution, however I'm extremely inexperienced on the field and the problem has a lot of constraints. To be honest i'm not even sure if genetics is the right approach for this problem (though I lack knowledge about other methods).
I managed to implement a basic system that can generate a valid solution using Jenetics, which I will present below along with my questions, however due to the constraints presented below the performance (fitness) of the solution is horrible, since the system generates huge amounts of invalid genes.
While I have tagged the question with Jenetics, I have no requirements to use the Jenetics library, however it looks like the best option off of those I've found.
I have the following "gene" (if that's even the word!) (pseudo-code):
class SomeGene {
top: Chromosome;
middle: Chromosome;
bottom_left: Chromosome;
bottom_right: Chromosome;
}
Where Chromosome
can be any of 90+
(ie: a large number, pulled from a database) of possibilities, however for a gene to be valid at all, there are several constraints of which chromosome can be on which attribute. This chromosome is a class with several attributes, some of which would do to be chromosomes themselves as well due to how we calculate the fitness, more on that in a few.
These are the constraints for a valid gene:
90
existing chromosomes, only 8
of them can be in top
, only 11
can be on middle, etc... If a top
has a chromosome that isn't part of those 8
then the entire gene is invalid. (ie: only chromosomes of type top
can be on top
).bottom_left
and bottom_right
are a special case, they can both have chromosomes that go in bottom
and both values can be equal (ie: bottom_left
and bottom_right
are the same chromosome) however some chromosomes of the bottom
type are special and they can only appear once in a gene, if they appear twice in a gene the entire gene is invalid.To make matters worse, the Chromosomes themselves have a field that can vary (pseudo-code):
class Chromosome {
type: 'TOP' | 'BOTTOM' | 'MIDDLE';
is_unique: boolean;
max_attributes: int; // maximum attributes (from 0 to 5), depends on the chromosome
attributes: List<Attribute> // a list of attributes, always has $max_attributes entries
}
This attribute
type would look like this (pseudo-code):
class Attribute {
type: 'A' | 'B' | 'C';
value: number;
}
There are about ~12 different attribute types, so for every possible attribute type I currently make a new chromosome with that variation. This means that if a chromosome has max_attributes
of 2
then I duplicate it so that all possible variations of the 2
attributes exist in the search space (ex: a single chromosome that has max_attributes = 2
would become several so that I have chromosomes with all possible values for attributes = [A, A] || [A, B] || [A, C] || [B, B] || [B, C] || [C, C]
). Attributes can repeat a shown, however the order doesn't matter.
When finding which attribute types can go into a chromosome, I can somewhat filter it down to some amount (ex: chromosomes that can have 2 max attributes will only have 12 possibilities in those two slots, while ones that can have 5 max attributes will have 12 in the first two slots and a different (slightly bigger) set of possibilities for the other 3 slots, etc..).
Finally, the fitness function takes the entire Gene
and sums values from all attributes of all chromosomes, then runs some complex math that evalutes the fitness of the entire gene. It's not possible to evaluate the fitness of a single chromosome by itself since depending on the value of other chromosomes in the gene, the fitness would vary (ie: a chromosome may be less effective if another specific chromosome type is present in the gene).
These are my questions right now:
top
chromosomes.I'm currently running the most basic Jenetics setup, from their own website as i've been extremely confused trying to navigate the library.
This is how I run the simulation:
public void run() {
// 1.) Define a genotype (factory) suitable
// for the problem.
// 4 chromosomes (one for each slot: top, middle, bottom_left, bottom_right)
Factory<Genotype<AnyGene<Chromosome>>> gtf =
Genotype.of(AnyChromosome.of(this::getRandomChromosome, 4));
// 2.) Create the evaluation/fitness environment.
Engine<AnyGene<Chromosome>, Float> engine = Engine
.builder(this::fitness, gtf)
.build();
// 4.) Start the execution (evolution) and
// collect the best result found.
Genotype<AnyGene<Chromosome>> best = engine.stream()
.limit(100_000) // number of generations to run?
.collect(EvolutionResult.toBestGenotype());
// Print the details of the best result
this.printBest(best);
}
This is how a random chromosome is generated:
private Chromosome getRandomChromosome() {
Random random = new Random();
// Fetch a random chromosome base (90 possibilities)
ChromosomeBase base = this.bases.get(random.nextInt(this.bases.size()));
List<Attribute> attached = new ArrayList<>();
// Filter down from all possible attributes to only ones that make sense on this base
List<Attribute> validForThisOne = this.all_attributes
.stream().filter(x -> x.tier() % 2 == 0)
.toList();
// Insert a random assortment of valid attributes
for (int i = 0; i < base.getMaxAttributes(); i++) {
var attribute = validForThisOne.get(random.nextInt(validForThisOne.size()));
attached.add(attribute);
}
// Return a chromosome that is valid by itself
return new Chromosome(base.getId(), base.getType(), base.getMaxAttributes(), attached);
}
Finally, this is how the fitness function currently looks:
private float fitness(Genotype<AnyGene<Chromosome>> gt) {
List<Chromosome> items = gt.chromosome()
.stream()
.map(AnyGene::allele)
.toList();
Map<Type, Chromosome> build = new HashMap<>();
for (Chromosome chromo : items) {
Type type = ChromoTypes.parse(chromo.type());
if (build.containsKey(type)) {
// This entire gene failed/died because it had duplicate types! ex: two top chromosomes!
return 0;
}
build.put(chromo, item);
}
// If we end up with less than 4 chromosome types, then we are missing chromosomes, so this gene is also invalid
if (build.size() < 4) {
return 0;
}
// Calculate the finess for our gene, this will always return a positive value, the bigger the better
float fitness = this.doABunchOfMath(build);
return fitness;
}
Upvotes: -1
Views: 89
Reputation: 854
I'm also a little bit confused with your description :), so I can only give you an advice for your last point: How to break things down?
I would suggest to start modelling the problem in your native problem space/domain, without thinking about GA (Jenetics) and its representations. What are the data structures your fitness function needs? Don't think about Jenetics at this point. This native representation should contain as less constraints as possible. Choose a model that only needs a minimal set of constraint, which might be not easy, I guess.
Once you have found a good native domain model, you can start thinking about a Genotype
representation of this model. Jenetics has the concept of a Codec
for this purpose. Finding a proper Codec
, with a minimal set of constraints, will also not be easy.
For the remaining constraints, have a look at the Constraint
interface. It only gives you minimal support for constraint handling, but it might be helpful in your case.
Giving support for finding a good Codec
, I would need more information of your native model and it's constraints.
Regards Franz
Upvotes: 0