Charlie

Reputation: 2851

parSapply not finding objects in global environment

I am trying to run code on several cores (I tried both the snow and parallel packages). I have the following:

library(parallel)
cl <- makeCluster(2)
y  <- 1:10
sapply(1:5, function(x) x + y)  # Works
parSapply(cl, 1:5, function(x) x + y)  # Fails

The last line returns the error:

Error in checkForRemoteErrors(val) : 
  2 nodes produced errors; first error: object 'y' not found

Clearly parSapply isn't finding y in the global environment. Is there a way to get around this? Thanks.

Upvotes: 26

Views: 7246

Answers (2)

Steve Weston

Reputation: 19677

It is worth mentioning that your example will work if parSapply is called from within a function, although the real issue is where the function function(x) x + y is created. For example, the following code works correctly:

library(parallel)
fun <- function(cl, y) {
  parSapply(cl, 1:5, function(x) x + y)
}
cl <- makeCluster(2)
fun(cl, 1:10)
stopCluster(cl)

This is because functions that are created inside other functions are serialized along with the local environment in which they were created, while functions created in the global environment are not serialized along with the global environment. This can be useful at times, but it can also lead to a variety of problems if you're not aware of the issue.
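To make the point about environments concrete, here is a minimal sketch. The helper name make_adder is made up for illustration; it just builds the worker function inside a local scope so that y travels with it when the function is serialized.

```r
library(parallel)

# Hypothetical helper (not from the question): creates the worker function
# inside a local scope, so its enclosing environment, which holds y,
# is serialized and sent to the nodes along with the function.
make_adder <- function(y) {
  force(y)  # evaluate y now so the closure carries a value
  function(x) x + y
}

add_y <- make_adder(1:10)

# The closure's environment is a local one, not the global environment
is_global <- identical(environment(add_y), globalenv())

cl <- makeCluster(2)
res <- parSapply(cl, 1:5, add_y)
stopCluster(cl)
```

Here is_global should be FALSE, and res should match what sapply(1:5, function(x) x + 1:10) returns locally. Keep in mind that everything in that local environment gets shipped to the nodes, so it is best to keep it small.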

Upvotes: 8

Joshua Ulrich

Reputation: 176698

The nodes don't know about the y in the global environment on the master. You need to tell them somehow.

library(parallel)
cl <- makeCluster(2)
y  <- 1:10
# Option 1: add y to the function definition and to the parSapply call
parSapply(cl, 1:5, function(x, y) x + y, y)
# Option 2: export y to the global environment of each node,
# then call your original code
clusterExport(cl, "y")
parSapply(cl, 1:5, function(x) x + y)
stopCluster(cl)
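If y lives inside a function rather than in your global environment, clusterExport will not find it by default, because its envir argument defaults to .GlobalEnv. A rough sketch of that case, assuming the worker function is defined at top level (the names worker and run are made up for illustration):

```r
library(parallel)

# Hypothetical worker defined at top level; y is a free variable that
# each node will look up in its own global environment.
worker <- function(x) x + y

# Hypothetical wrapper: y is local here, so tell clusterExport where to find it
run <- function(cl) {
  y <- 1:10
  clusterExport(cl, "y", envir = environment())
  parSapply(cl, 1:5, worker)
}

cl <- makeCluster(2)
res <- run(cl)
stopCluster(cl)
```

The exported y ends up in the global environment of each node, which is where worker looks for it.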

Upvotes: 25
