Reputation: 76
I've been playing around with the doRedis package in R to try to run some code on a cluster. I've got one Windows machine and one machine running Ubuntu (which is where redis is installed).
I can happily run the example from the doRedis documentation but my goal is to be able to use doRedis in tandem with caret for some machine learning applications. It's my understanding that caret allows for parallelisation and it seems that others have gotten this to work but for the life of me I can't figure out where I'm going wrong.
I found this example and modified it slightly to the following:
library(caret)
library(doRedis)

dat = iris

registerDoRedis("jobs",
                host = "xyz")

xgb.grid = expand.grid(nrounds = c(10, 200),
                       max_depth = c(6),
                       eta = c(0.05),
                       gamma = c(0.01),
                       colsample_bytree = 1,
                       min_child_weight = 1,
                       subsample = 1)

ctrl = trainControl(method = 'cv',
                    number = 10,
                    verboseIter = FALSE,
                    allowParallel = TRUE)

set.seed(13)
xgb1 <- train(Species ~ .,
              data = dat,
              method = "xgbTree",
              trControl = ctrl,
              verbose = FALSE,
              tuneGrid = xgb.grid)

removeQueue("jobs")
This only runs on the local machine, and isn't distributed to the redis queue (I can see this using doRedis::jobs(), as well as by running redis-cli --stat in the Ubuntu terminal, both of which show no jobs being passed to the server).
What am I missing?
Upvotes: 3
Views: 117
Reputation: 2110
Please check out https://topepo.github.io/caret/parallel-processing.html
Relevant quote:
train, rfe, sbf, bag and avNNet were given an additional argument in their respective control files called allowParallel that defaults to TRUE. When TRUE, the code will be executed in parallel if a parallel backend (e.g. doMC) is registered.
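Since the quote hinges on a backend being registered, one quick check is to ask foreach what it currently sees. This is only a minimal sketch: describe_backend is a hypothetical helper name, and it relies on the foreach functions getDoParRegistered(), getDoParName() and getDoParWorkers(). Run it right after registerDoRedis() and before train().

```r
library(foreach)

# Hypothetical helper: summarise the backend that %dopar% would currently use.
describe_backend <- function() {
  list(
    registered = getDoParRegistered(),  # FALSE means %dopar% falls back to sequential
    name       = getDoParName(),        # NULL when nothing is registered
    workers    = getDoParWorkers()      # worker count as reported by the backend
  )
}

str(describe_backend())
```

If registered comes back FALSE, or the name is not the doRedis backend, caret has nothing to hand the resampling loop to and will run it in the local session.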
One suggestion to help you debug this is to first try to use redis locally; if that works, then specify the other server.
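A rough sketch of that local test, assuming a Redis server is listening on localhost at the default port on the machine where you run it (so on the Ubuntu box in your setup). run_local_check is a hypothetical helper name; the queue name "jobs" is taken from the question.

```r
library(foreach)

# Hypothetical helper: push a tiny foreach job through a local Redis queue.
run_local_check <- function(queue = "jobs", n_workers = 2) {
  doRedis::registerDoRedis(queue)        # host defaults to "localhost"
  on.exit(doRedis::removeQueue(queue))   # remove the queue when the function exits
  doRedis::startLocalWorkers(n = n_workers, queue = queue)
  # Each task returns the PID of the R process that executed it.
  foreach(i = 1:4, .combine = c) %dopar% Sys.getpid()
}

# Usage (needs a running local Redis server):
# pids <- run_local_check()
# any(pids != Sys.getpid())  # TRUE suggests the tasks ran in worker processes
```

If this works locally, repeat it with the host argument pointing at the other server, then swap the foreach loop for the train() call.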
Upvotes: 1