Reputation: 2057
I'm working on pipeline network optimization, and I represent each chromosome as a string of numbers, as follows:
Example:
chromosome [1] = 3 4 7 2 8 9 6 5
where each number refers to a well, and the distances between wells are defined. A well cannot appear more than once in a chromosome. For example:
chromosome [1]' = 3 4 7 2 7 9 6 5 (not acceptable)
What is the best mutation operator for a representation like this? Thanks in advance.
Upvotes: 0
Views: 99
Reputation: 8606
Can't say "best" but one model that I've used for graph-like problems is: For each node (well number), calculate the set of adjacent nodes / wells from the entire population. e.g.,
population = [[1,2,3,4], [1,2,3,5], [1,2,3,6], [1,2,6,5], [1,2,6,7]]
adjacencies = {
1 : [2] , #In the entire population, 1 is always only near 2
2 : [1, 3, 6] , #2 is adjacent to 1, 3, and 6 in various individuals
3 : [2, 4, 5, 6], #...etc...
4 : [3] ,
5 : [3, 6] ,
6 : [3, 2, 5, 7],
7 : [6]
}
choose_from_subset = [1,2,3,4,5,6,7] #At first, entire population
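As a rough sketch of how that adjacency table could be computed (the function name `build_adjacencies` is my own, and it assumes two wells count as adjacent when they sit next to each other in an individual, in either direction):

```python
from collections import defaultdict

def build_adjacencies(population):
    """Map each well to the set of wells that appear next to it in any individual."""
    adjacencies = defaultdict(set)
    for individual in population:
        # Walk over consecutive pairs and record the link in both directions.
        for left, right in zip(individual, individual[1:]):
            adjacencies[left].add(right)
            adjacencies[right].add(left)
    return dict(adjacencies)

population = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 6, 5], [1, 2, 6, 7]]
adjacencies = build_adjacencies(population)
```

Sets are used instead of lists so that an adjacency seen in several individuals is stored only once.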
Then create a new individual / network by:
choose_next_individual(adjacencies, choose_from_subset) :
Sort adjacencies by the size of their associated sets
From the choices in choose_from_subset, choose the well with the highest number of adjacent possibilities (e.g., either 3 or 6, both of which have 4 possibilities)
If there is a tie (as there is with 3 and 6), choose among them randomly (let's say "3")
Place the chosen well as the next element of the individual / network ([3])
fewerAdjacencies = Remove the chosen well from the set of adjacencies (see below)
new_choose_from_subset = adjacencies to your just-chosen well (i.e., 3 : [2,4,5,6])
Recurse -- choose_next_individual(fewerAdjacencies, new_choose_from_subset)
The idea is that nodes with high numbers of adjacencies are ripe for recombination (since the population hasn't converged on, e.g., 1->2), a lower "adjacency count" (but non-zero) implies convergence, and a zero adjacency count is (basically) a mutation.
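The steps above could be sketched roughly like this. It is written as a loop rather than a recursion, and the name `build_individual` and the `rng` parameter are my own. One assumption to flag: when the just-chosen well has no neighbours left, this sketch falls back to all unplaced wells and still prefers the most-connected one, which becomes a purely random pick once every remaining count is zero.

```python
import random

def build_individual(adjacencies, rng=random):
    """Build one new individual from an adjacency table (well -> neighbouring wells)."""
    # Work on a copy so the caller's table is left untouched.
    remaining = {well: set(neighbours) for well, neighbours in adjacencies.items()}
    candidates = list(remaining)  # at first, every well is a candidate
    individual = []
    while remaining:
        # Assumption: if the last well had no neighbours left, consider all unplaced wells.
        candidates = candidates or list(remaining)
        # Pick the candidate with the most remaining adjacencies, breaking ties randomly.
        most = max(len(remaining[well]) for well in candidates)
        chosen = rng.choice([w for w in candidates if len(remaining[w]) == most])
        individual.append(chosen)
        # The next candidates are the neighbours of the well just placed.
        candidates = sorted(remaining.pop(chosen))
        # Remove the placed well from every other adjacency set.
        for neighbours in remaining.values():
            neighbours.discard(chosen)
    return individual

adjacencies = {
    1: [2], 2: [1, 3, 6], 3: [2, 4, 5, 6], 4: [3],
    5: [3, 6], 6: [3, 2, 5, 7], 7: [6],
}
child = build_individual(adjacencies, random.Random(0))
```

Because each well is removed from the table once it is placed, the result is always a permutation, so no well can be duplicated.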
Just to show a sample run ..
#Recurse: After removing "3" from the adjacencies
new_graph = [3]
new_choose_from_subset = [2,4,5,6] #from 3 : [2,4,5,6]
adjacencies = {
1: [2] ,
2: [1, 6] ,
4: [] ,
5: [6] ,
6: [2, 5, 7] ,
7: [6]
}
#Recurse: "6" has most adjacencies in new_choose_from_subset, so choose and remove
new_graph = [3, 6]
new_choose_from_subset = [2, 5, 7] #from 6 : [2,5,7]
adjacencies = {
1: [2] ,
2: [1] ,
4: [] ,
5: [] ,
7: []
}
#Recurse: Amongst [2,5,7], 2 has the most adjacencies
new_graph = [3, 6, 2]
new_choose_from_subset = [1]
adjacencies = {
1: [] ,
4: [] ,
5: [] ,
7: []
}
#new_choose_from_subset contains only 1, so that's your next...
new_graph = [3,6,2,1]
new_choose_from_subset = []
adjacencies = {
4: [] ,
5: [] ,
7: []
}
#From here on out, you'd be choosing randomly between the rest, so you might end up with:
new_graph = [3, 6, 2, 1, 5, 7, 4]
Sanity-check? 3->6 occurs 1x in original, 6->2 appears 2x, 2->1 appears 5x, 1->5 appears 0, 5->7 appears 0, 7->4 appears 0. So you've preserved the most-common adjacency (2->1) and two other "perhaps significant" adjacencies. Otherwise, you're trying out new adjacencies in the solution space.
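If you want to run that sanity check in code, a small helper along these lines would do it (the name `count_adjacent` is made up for this sketch, and it treats a->b and b->a as the same adjacency):

```python
def count_adjacent(population, a, b):
    """Count the individuals in which wells a and b sit next to each other (either order)."""
    count = 0
    for individual in population:
        pairs = zip(individual, individual[1:])
        if any({left, right} == {a, b} for left, right in pairs):
            count += 1
    return count

population = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 3, 6], [1, 2, 6, 5], [1, 2, 6, 7]]
child = [3, 6, 2, 1, 5, 7, 4]
# One count per consecutive pair in the new individual.
counts = {(a, b): count_adjacent(population, a, b) for a, b in zip(child, child[1:])}
```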
UPDATE: Originally I'd forgotten the critical point that when recursing, you choose the most-connected node from among those adjacent to the just-chosen node. That's critical to preserving high-fitness chains! I've updated the description.
Upvotes: 1