Arnaud

Reputation: 387

R file connection when using parallel

I am trying to open a file connection on the workers of a cluster (using the parallel package). It works correctly in the global environment, but it gives an error message when run on the cluster workers (see the script below). Did I miss something?

Any suggestions?

Thanks,

# This part works
#----------------
cat("This is a test file" , file={f <- tempfile()})
con <- file(f, "rt")


# Doing what I think is the same thing gives an error message when executed in parallel
#--------------------------------------------------------------------------------------

library(parallel)
cl <- makeCluster(2)

## Exporting the object f to the cluster

clusterExport(cl, "f")
clusterEvalQ(cl[1], con <- file(f[[1]], "rt"))
 #Error in checkForRemoteErrors(lapply(cl, recvResult)) :
 # one node produced an error: cannot open the connection


## Creating the object f on the cluster

clusterEvalQ(cl[1],cat("This is a test file" , file={f <- tempfile()}))
clusterEvalQ(cl[1],con <- file(f, "rt"))
 #Error in checkForRemoteErrors(lapply(cl, recvResult)) :
 # one node produced an error: cannot open the connection 


############ Here is my sessionInfo() ###################
# R version 3.3.0 (2016-05-03)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 7 x64 (build 7601) Service Pack 1
#
# locale:
# [1] LC_COLLATE=French_Canada.1252  LC_CTYPE=French_Canada.1252   
# [3] LC_MONETARY=French_Canada.1252 LC_NUMERIC=C                  
# [5] LC_TIME=French_Canada.1252    
#
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base 
# 

Upvotes: 2

Views: 616

Answers (1)

Steve Weston

Reputation: 19677

Try changing the code to return a NULL rather than the created connection object:

clusterEvalQ(cl[1], {con <- file(f[[1]], "rt"); NULL})

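clusterEvalQ sends the value of the evaluated expression back to the master, and here that value is the connection itself. Connection objects can't be safely sent between the master and workers; ending the expression with NULL avoids that.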
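For a fuller picture, here is a minimal sketch of the same idea from start to finish. It assumes the workers run on the same machine as the master, so the temporary file path is valid on both sides. The variable name `lines` and the read/close steps are illustrative additions, not part of the original question. The connection is opened, used, and closed on the worker, and only plain data (or NULL) travels back to the master.

```r
library(parallel)

cl <- makeCluster(2)

# Create the file on the master and export only its path (a character string).
f <- tempfile()
cat("This is a test file\n", file = f)
clusterExport(cl, "f")

# Open the connection on the worker; return NULL so the connection
# object itself is never sent back to the master.
clusterEvalQ(cl[1], { con <- file(f, "rt"); NULL })

# Read on the worker; the result is an ordinary character vector,
# which can be returned to the master without trouble.
lines <- clusterEvalQ(cl[1], readLines(con))[[1]]

# Close the connection on the worker, again returning NULL.
clusterEvalQ(cl[1], { close(con); NULL })

stopCluster(cl)
```

The connection stays in the worker's global environment between calls, so later clusterEvalQ calls on the same worker can keep using `con` until it is closed.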

Upvotes: 1
