Kun Ren

Reputation: 5033

R: How to source variable files on each parallel cluster node?

I want to run the following code:

library(parallel)

cl <- makeCluster(detectCores())
requires <- c("fUnitRoots","fGarch")

for(req in requires) {
  clusterEvalQ(cl,require(req))
}

list1 <- clusterApply(cl,1:10,function(i) {
  x <- rnorm(100)
  y <- rnorm(100)
  m <- lm(y~x)
  res <- resid(m)
  t <- adfTest(res) ## this function is in {fUnitRoots}
  return(t@test$statistic)
})

stopCluster(cl)

However, the fUnitRoots package is not loaded on any node. This is probably because clusterEvalQ(cl, expr) takes expr as an unevaluated expression, so require(req) is sent to the workers as written: req is treated as a literal name rather than as the loop variable holding the package name as a character string.

How should I refine the code to make it work?

Upvotes: 2

Views: 692

Answers (1)

Steve Weston

Reputation: 19677

The "character.only" option is useful when calling "require" in this situation. Also, I would use "clusterCall" instead of "clusterEvalQ" to allow the package names to be passed as an argument to a worker function:

clusterCall(cl, function(pkgs) {
  for (req in pkgs) {
    require(req, character.only=TRUE)
  }
}, c("fUnitRoots","fGarch"))

This is also a bit more efficient since it loads all of the packages in a single cluster operation.
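To see why "character.only" matters, here is a small local sketch that needs no cluster. It assumes the "stats4" package (which ships with R) as a stand-in for the packages in the question; the variable names are only for illustration. Without character.only=TRUE, require() takes its argument as a literal name, so it looks for a package called "req".

```r
# "stats4" ships with R; it stands in for fUnitRoots/fGarch here (an assumption).
req <- "stats4"

# Without character.only, require() looks for a package literally named "req".
# This is expected to be FALSE unless such a package happens to be installed.
a <- suppressWarnings(require(req, quietly = TRUE))

# With character.only = TRUE, the value stored in req is used as the package name.
b <- require(req, character.only = TRUE, quietly = TRUE)

print(c(without = a, with = b))
```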

You can verify that the packages were correctly loaded using:

clusterEvalQ(cl, search())
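If you would rather check this programmatically than read the search paths, one possible sketch is to have each worker return the result of each require() call. The package names below ("splines" and "stats4", both shipped with R) and the two-worker cluster are assumptions chosen so the example runs anywhere; substitute your own packages.

```r
library(parallel)

# Two workers are enough for a demonstration.
cl <- makeCluster(2)

# Stand-in package list (an assumption); replace with c("fUnitRoots", "fGarch").
pkgs <- c("splines", "stats4")

# Each worker returns a named logical vector: TRUE where require() succeeded.
loaded <- clusterCall(cl, function(pkgs) {
  vapply(pkgs, function(p) {
    suppressWarnings(require(p, character.only = TRUE, quietly = TRUE))
  }, logical(1))
}, pkgs)

# TRUE only if every package loaded on every worker.
ok <- all(unlist(loaded))

stopCluster(cl)

if (!ok) stop("some packages failed to load on at least one worker")
```

Here "loaded" has one element per worker, so a missing package on a single node shows up as a FALSE in that node's vector.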

Upvotes: 4
