Reputation: 4659
I am using R's base parallel package and sourcing a script from within a parLapply call. If I place the code inline, the results are as expected. If I replace the code with a call to source() pointing to a script containing exactly the same code, it fails.
A reproducible example:
require(parallel)
# generate a list of random vectors with increasing means
set.seed(1)
x <- lapply(1:4, function(i) rnorm(10,i,1))
# create cluster and export the above list
cl <- makePSOCKcluster(4)
clusterExport(cl, varlist=c("x"))
# use inline code first
means.inline <- parLapply(cl, 1:length(x), function(i) {
  values <- x[[i]]
  mean(values)
})
# now call the exact same code, but sourced from a separate script
means.source <- parLapply(cl, 1:length(x), function(i) {
  source("code.R")
})
stopCluster(cl)
The contents of code.R are simply the same code as in the first parLapply:
values <- x[[i]]
mean(values)
The first parLapply executes and calculates the means as expected. The second parLapply fails:
Error in checkForRemoteErrors(val) :
4 nodes produced errors; first error: object 'i' not found
Upvotes: 1
Views: 1277
Reputation: 4659
From ?source:
source causes R to accept its input from the named file or URL or connection. Input is read and parsed from that file until the end of the file is reached, then the parsed expressions are evaluated sequentially in the chosen environment.
The clue is in the mention of the "chosen environment". A peek at the local argument of source explains what that is:
TRUE, FALSE or an environment, determining where the parsed expressions are evaluated. FALSE (the default) corresponds to the user's workspace (the global environment) and TRUE to the environment from which source is called.
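This means that source("code.R") evaluates the script in the global environment by default, not in the environment of the function that parLapply calls. The variable i exists only as that function's argument, so the script cannot see it. (x is found because clusterExport placed it in each worker's global environment.)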
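The same scoping behavior can be reproduced without a cluster. The following is a minimal sketch, not the original code: the temporary script, the argument name idx, and the helper functions f and g are hypothetical names chosen for illustration.

```r
# Throwaway one-line script (a hypothetical stand-in for code.R).
script <- tempfile(fileext = ".R")
writeLines("idx * 10", script)

# Default local = FALSE: the script is evaluated in the global environment,
# where the function argument idx does not exist.
f <- function(idx) source(script)$value

# local = TRUE: the script is evaluated in the calling function's environment,
# where idx is defined.
g <- function(idx) source(script, local = TRUE)$value

# Capture the error message from the failing call instead of stopping.
res_default <- tryCatch(f(2), error = function(e) conditionMessage(e))
res_local <- g(2)

print(res_default)
print(res_local)
```

Here f fails for the same reason as the second parLapply call, while g succeeds because the script can see the function's argument.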
To get the required behavior:
source("code.R", local=TRUE)
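One more detail worth noting: source() returns a list with elements value and visible rather than the bare result, so the mean still has to be extracted with $value. Below is a sketch of the corrected example under a few assumptions: the script is written to a temporary file so the snippet is self-contained, its path is exported to the workers as script, and a two-worker cluster is used.

```r
library(parallel)

# Write the script body to a temporary file (stand-in for code.R).
script <- tempfile(fileext = ".R")
writeLines(c("values <- x[[i]]", "mean(values)"), script)

# Same data as in the question.
set.seed(1)
x <- lapply(1:4, function(i) rnorm(10, i, 1))

cl <- makePSOCKcluster(2)
# Export the data and the script path from the current environment.
clusterExport(cl, varlist = c("x", "script"), envir = environment())

means.source <- parLapply(cl, seq_along(x), function(i) {
  # Evaluate the script in this function's environment so it can see i,
  # then pull out the value of its last expression.
  source(script, local = TRUE)$value
})
stopCluster(cl)

# Reference result computed without the cluster.
means.inline <- lapply(x, mean)
print(all.equal(means.source, means.inline))
```

Without $value, each element of means.source would be a two-element list instead of a single number.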
Upvotes: 2