Reputation: 1537
I use genetic algorithms in my research a lot, and I ran across an interesting question about how best to perform your genetic operations on a genome. Say you have a function defined by f(x,y) = a*x^n + b*x^(n-1) + ... + c*y^m + d*y^(m-1) + ... It's just a multivariable function that is somewhat expensive to calculate, so you're trying to be as efficient as possible with your genetic operations.
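For concreteness, a fitness evaluation of that shape might look like the following (the coefficients and exponents here are placeholders, not my actual function):
% Placeholder objective of the general form above; a, b, c, d, n, m are illustrative only
a = 2; b = -1; c = 3; d = 0.5; n = 4; m = 3;
f = @(x,y) a*x.^n + b*x.^(n-1) + c*y.^m + d*y.^(m-1);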
If you're using a binary representation of the genome, I have figured out that there are 2 reasonable ways to perform the genetic operations. Let's look at just the crossover stage.
Here is the code for a vectorized tournament selection in MATLAB (for context on the variable names):
%% Tournament Selection
T = round(rand(2*popSize,S)*(popSize-1)+1); % Tournaments
[~,idx] = max(F(T),[],2); % Index of Winners
W = T(sub2ind(size(T),(1:2*popSize)',idx)); % Winners
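(As an aside, randi draws uniform integers directly and avoids the slight endpoint bias that round(rand*...) introduces, so an equivalent tournament draw would be:
T = randi(popSize,2*popSize,S); % Tournaments, uniform over 1..popSize
The round(rand*...) form is kept below only to match the rest of the code.)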
So you have 2 different variables being optimized, and my question is: do you want to split up the genetic operations so that a crossover is applied to each variable individually and the arrays are then concatenated back together, which would look like this for a 2-point crossover:
%% 2 Point Crossover
Pop2 = Pop(W(1:2:end),:); % Set Pop2 = Pop Winners 1
P2A = Pop(W(2:2:end),:); % Assemble Pop2 Winners 2
% Split Pop2 for x and y genomes
xPop2 = Pop2(:,1:genome/2);
yPop2 = Pop2(:,genome/2 + 1:end);
% Split P2A for x and y genomes
xP2A = P2A(:,1:genome/2);
yP2A = P2A(:,genome/2+1:end);
% For x genome
Ref = ones(popSize,1)*(1:genome/2); % Reference Matrix
CP = sort(round(rand(popSize,2)*(genome/2-1)+1),2); % Crossover Points
xidx = CP(:,1)*ones(1,genome/2)<Ref & CP(:,2)*ones(1,genome/2)>Ref; % Logical Index
xPop2(xidx) = xP2A(xidx); % Recombine Winners
% For y genome
Ref = ones(popSize,1)*(1:genome/2); % Reference Matrix
CP = sort(round(rand(popSize,2)*(genome/2-1)+1),2); % Crossover Points
yidx = CP(:,1)*ones(1,genome/2)<Ref & CP(:,2)*ones(1,genome/2)>Ref; % Logical Index
yPop2(yidx) = yP2A(yidx); % Recombine Winners
Pop2 = horzcat(xPop2,yPop2); % Reassemble full genomes
P2A = horzcat(xP2A,yP2A); % Reassemble mate genomes
or do you treat the genome as a single chromosome and just perform the 2-point crossover as if it were a single-variable genome, like this:
Pop2 = Pop(W(1:2:end),:); % New Pop is Winners of old Pop
P2A = Pop(W(2:2:end),:); % Assemble Pop2 Winners 2
Ref = ones(popSize,1)*(1:genome); % Reference Matrix
CP = sort(round(rand(popSize,2)*(genome-1)+1),2); % Crossover Points
idx = CP(:,1)*ones(1,genome)<Ref & CP(:,2)*ones(1,genome)>Ref; % Logical Index
Pop2(idx) = P2A(idx); % Recombine Winners
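For reference, here is a sketch of how the two halves of such a binary genome might be decoded back into real x and y values (the bounds lo and hi and the linear fixed-point scaling are assumptions, not part of the code above). It also shows the structural difference between the two schemes: the whole-genome crossover can place a cut point inside either variable's bit field, while the split version never exchanges bits across the x/y boundary.
nb = genome/2; % Bits per variable
pw = 2.^(nb-1:-1:0)'; % Bit place values (MSB first)
xInt = Pop2(:,1:nb)*pw; % Integer value of each x bit field
yInt = Pop2(:,nb+1:end)*pw; % Integer value of each y bit field
x = lo + (hi-lo)*xInt/(2^nb-1); % Map to the assumed search interval [lo,hi]
y = lo + (hi-lo)*yInt/(2^nb-1);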
Does anyone know of any research that shows the difference between these 2 ways of representing the genome? I haven't been able to find anything published about it, but that may just be because I didn't know how to word my question intelligently on Google.
Thanks
Upvotes: 3
Views: 101
Reputation: 1622
My main comments are as follows:
The mutation stage strongly depends on the problem type.
For continuous, nonlinear, dynamical problems, such as the weather/meteorology physics problem you cited, with estimation and prediction processes, a GA framework can still perform very well, provided it is stated properly and always with a clear focus on what you are pursuing with it. A discrete mutation strategy (such as the single selective crossover presented) would perform really poorly, much like a standard unguided, random, brute-force search. As you said, this is indeed the naivest approach, and the performance and execution time will express this fact beyond doubt.
Given that the problem is physical (and I am assuming quite a lot here), and thus that there is a dynamical system, probably with a certain number of states, finite or not, with a certain continuity in the variables, and with or without some degree of nonlinearity, the main GA focus here should involve Bayesian estimators. In this context, each winner, or individual, or whatever descriptive word you prefer, would have a mutation stage that is not based on a simple exchange of parameters; that is just the simplest, flashiest explanation of GAs. The better, faster, more efficient way to interpret the mutation stage in this context is to apply a statistical estimator able to gather the proper gradient from this dynamical nonlinear system, estimated at each stage. Under a Bayesian GA framework, particle filters are an interesting alternative for building a good mutation stage.
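As a minimal sketch of that idea (not your code; it assumes a real-valued population matrix Pop of size popSize-by-2 and a fitness vector F, and the jitter scale is an arbitrary choice), a resample-and-move step in the particle-filter spirit could replace a bit-flip mutation:
w = F - min(F) + eps; % Shift fitness to nonnegative weights
w = w/sum(w); % Normalize to a probability distribution
cdf = cumsum(w); % Cumulative weights for resampling
idx = arrayfun(@(u) find(cdf >= u,1), rand(popSize,1)); % Multinomial resampling
Pop = Pop(idx,:); % Particles concentrate in high-fitness regions
sigma = 0.05*std(Pop); % Jitter scale from population spread (arbitrary)
Pop = Pop + randn(popSize,2).*(ones(popSize,1)*sigma); % Gaussian move step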
The fact that you are using multiple nested integrals suggests to me that your problem is truly dynamical, and in that context, guiding the GA through a more statistical, Bayesian framework makes a lot of sense to me. Remember that Bayesian methods are also robust under certain conditions.
If the problem is plainly dynamical, you should face the truth, consider discarding the GA, and move to nonlinear optimization techniques. Parameter estimation and system identification methods would perform even better, if the problem can be suitably expressed as a parameter-search process instead.
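As a minimal sketch of that route in base MATLAB (assuming your objective is available as a function handle f(x,y); the starting point is a placeholder):
obj = @(p) -f(p(1),p(2)); % Negate so that maximizing f becomes a minimization
p0 = [0 0]; % Placeholder initial guess
[pBest,fBest] = fminsearch(obj,p0); % Derivative-free Nelder-Mead search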
On the other hand, if your problem is discrete in an unavoidable way (that is, if you do not have PDE expressions of any kind, or if those PDEs are really trivial to solve, or if there is plainly no physical process involving states), and your problem is mainly based on expert rules, database lookups, or processing large data systems, and you have discarded, for whatever reason you find pertinent, building a dynamical model from these data, then, and only then, I would guide the GA as a brute-force method, with random crossover stages between the better solutions. A discrete Bayesian framework would even be useful here, providing the probability estimates for every mutation stage.
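One hedged sketch of what such probability estimates could look like (this is an estimation-of-distribution step in the spirit of PBIL, not something from your code; the smoothing constant is arbitrary):
p = mean(Pop(W,:),1); % Per-bit probability of a 1 among the winners
p = 0.9*p + 0.1*0.5; % Smooth toward 0.5 to preserve diversity (arbitrary)
Mut = rand(popSize,genome) < ones(popSize,1)*p; % Sample new bits from the estimates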
If you have more information, I could make this answer evolve.
Best wishes and luck,
hypfco
Upvotes: 2