Reputation: 1961
Where can I find some good tutorials with examples with free data, on how to implement metaheuristics algorithms in R ?
I am asking this because I found lots of resources on how to do it, however I am facing big problems on moving from the theory to implement it.
The Book Essentials of Metaheuristics (by Professor Sean Luke) is a great book to start, however for people with a limited programming background and no experience with algorithms, it's hard to implement them without some "real examples" with data, etc
Picking up an example from the Book Essentials of Metaheuristics (Page 16):
Algorithm 5 Steepest Ascent Hill-Climbing
1: n ← number of tweaks desired to sample the gradient
2: S ← some initial candidate solution
3: repeat
4: R ← Tweak(Copy(S))
5: for n − 1 times do
6: W ← Tweak(Copy(S))
7: if Quality(W) > Quality(R) then
8: R ← W
9: if Quality(R) > Quality(S) then
10: S ← R
11: until S is the ideal solution or we have run out of time
12: return S
I would like to have something that would after give me an example using real data. I am looking for something like this.
I've seen many questions concerning specific algorithms (like GA) and maybe I am duplicating questions that already exist but I did not found this question in particular, but if this is duplicated please warn me.
Other languages like python would also help (e.g. any language similar to R).
Upvotes: 3
Views: 1766
Reputation: 57686
I'm not familiar with metaheuristics as a field, but the pseudocode as you've given it actually translates fairly easily into R syntax:
# I never metaheuristic I didn't like
metah <- function(S, quality, tweak, n, outer.limit, threshold)
{
outer.n <- 0
repeat {
outer.n <- outer.n + 1
R <- tweak(S)
for(i in seq_len(n - 1))
{
W <- tweak(S)
if(quality(W) > quality(R))
R <- W
}
if(quality(R) > quality(S))
S <- R
if(quality(S) >= threshold || outer.n >= outer.limit)
break
}
S
}
Now all you have to do is provide suitable functions for quality
and tweak
.
For example, suppose we want to fit a linear regression. In this case, we have a vector of responses y
, and a matrix of vectors X
. The solution S
would be the vector of candidate coefficients at each step, and the "quality" is the squared error loss: sum((y - yhat)^2)
. Note that here, the lower the quality, the better.
For tweak
, we might use a normal distribution of perturbations from the current solution S
, with a user-specified covariance matrix.
This can then be coded up as
require(MASS) # for mvrnorm
quality <- function(S, y, X)
sum((y - X %*% S)^2)
tweak <- function(S, sigma=rep(1, length(s))
S + mvrnorm(length(S), 0, sigma)
metah <- function(y, X, quality, tweak, n, outer.limit, threshold)
{
outer.n <- 0
S <- rep(1, ncol(X))
repeat {
outer.n <- outer.n + 1
R <- tweak(S)
for(i in seq_len(n - 1))
{
W <- tweak(S)
if(quality(W, y, X) < quality(R, y, X)) # note reversed comparison!
R <- W
}
if(quality(R, y, X) < quality(S, y, X))
S <- R
if(quality(S) <= threshold || outer.n >= outer.limit)
break
}
S
}
Further improvements might be:
Replace the inner loop for(i in ...)
with vectorised code using *apply
let the distribution of tweaks vary depending on the characteristics of the solution, instead of hard-coding it as above (in particular, sigma
should vary based on the scale of your X variables)
express threshold
in terms of your progress toward a minimum, for example how far each candidate solution has moved from the previous iteration.
Upvotes: 5