Neodyme

Reputation: 557

cannot open the connection error with parLapply

I successfully processed some data on a 4-core computer using parLapply, with code like the following:

require("parallel")
setwd("C:/...")

file_summary<-read.table("file_summary",header=T,sep=" ",stringsAsFactors=F)
new_filename<-sub(".asc",".dat",file_summary$filename)

file_list<-list.files()

myfunction <- function(k) {

  x <- M$x[k]
  y <- M$y[k]

  for (i in 1:length(file_summary[,1])) {
    if ( # logical condition on x and y ) {
      new_file <- new_filename[i]
      new_data <- read.table(new_file, header=T, sep=" ")
      eval <- matrix(, nrow=length(new_data$x), ncol=1)

      for (j in 1:length(new_data$x)) {
        eval[j] <- (new_data$x[j]-x)^2 + (new_data$y[j]-y)^2
      }
      index <- which(eval == max(eval))
      out <- c(new_data$x[index], new_data$y[index], new_data$mean[index], new_data$S[index])
    }
    rm(eval)
    gc()
  }
  return(out)
}

n_tasks <- length(M$x) 
n_cores <- 8

Cl = makeCluster(n_cores, type = "PSOCK") 
clusterExport(Cl, "M")
clusterExport(Cl, "file_summary")
clusterExport(Cl, "new_filename")
clusterExport(Cl, "file_list")

Results <- parLapply(Cl, c(1:n_tasks), myfunction)

stopCluster(Cl)

Now, using exactly the same code, the same data, and the same directory structure (i.e. paths), I am trying to run the analysis on an 8-core machine to speed it up further. However, on my first attempt I get the following error:

8 nodes produced errors; first error: cannot open the connection

I tried freeing up some RAM (by closing non-R processes) to see if it helped, but it didn't. Any suggestions?

Upvotes: 1

Views: 4965

Answers (2)

MCH

Reputation: 480

You need to pass the current working directory to your function as an extra argument of parLapply(). Inside myfunction, reset the working directory with setwd():

myfunction = function(k, wd_)
{
      setwd(wd_)
        ...
}
...
wd_ = getwd()
Results <- parLapply(Cl, c(1:n_tasks), myfunction, wd_)

N.B. Make sure that R, RStudio, and Rscript are not blocked by the firewall.
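As a rough sketch of the same idea, the working directory can also be set once per worker rather than on every call, and then inspected to confirm where each worker is actually running. The names `cl`, `wd`, and `worker_dirs` are illustrative and not taken from the question, and the cluster size of 2 is only for the demonstration. This shows how to check the directories; it is not a confirmed fix for the error above.

```r
library(parallel)

# Illustrative two-worker cluster; the question uses 8 workers.
cl <- makeCluster(2, type = "PSOCK")

# Directory the master process is working in.
wd <- getwd()

# Set the working directory once on every worker.
invisible(clusterCall(cl, setwd, wd))

# Ask each worker where it is now; returns one path per worker.
worker_dirs <- unlist(clusterEvalQ(cl, getwd()))
print(worker_dirs)

stopCluster(cl)
```

If the printed paths differ from the directory that holds the data files, relative file names passed to read.table will not resolve on the workers.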

Upvotes: 0

Steve Weston

Reputation: 19677

The only operation that I see in "myfunction" that can generate a "cannot open the connection" error is "read.table". You might want to add the following right before calling read.table:

if (! file.exists(new_file))
  stop(paste(new_file, "does not exist"))

It might also be useful to check that the worker has permission to read the file:

if (file.access(new_file, mode=4) == -1)
  stop(paste("no read permission on", new_file))

It seems worthwhile to rule out these problems.
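The two checks above could be wrapped in a small helper so that the error returned by a worker names the file and the directory the worker was in. This is only a sketch: `checked_read` is a hypothetical helper, not part of the original code, and the temporary file exists just to demonstrate it.

```r
# Hypothetical helper: verify a file before reading it and report
# the worker's working directory when the file is missing.
checked_read <- function(path) {
  if (!file.exists(path))
    stop(paste(path, "does not exist; working directory is", getwd()))
  if (file.access(path, mode = 4) == -1)
    stop(paste("no read permission on", path))
  read.table(path, header = TRUE, sep = " ")
}

# Demonstration with a temporary space-separated file.
tmp <- tempfile(fileext = ".dat")
write.table(data.frame(x = 1:3, y = 4:6), tmp, sep = " ", row.names = FALSE)
ok <- checked_read(tmp)

# A missing file now produces a descriptive message instead of
# the generic "cannot open the connection".
msg <- tryCatch(checked_read(file.path(tempdir(), "missing.dat")),
                error = function(e) conditionMessage(e))
print(msg)
```

Inside myfunction, the call to read.table(new_file, ...) could be replaced by checked_read(new_file), so that the "first error" reported by parLapply identifies the file that failed.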

Upvotes: 2
