Reputation: 557
I successfully processed some data on a 4-core computer using parLapply, with code like the following:
require("parallel")
setwd("C:/...")
file_summary<-read.table("file_summary",header=T,sep=" ",stringsAsFactors=F)
new_filename<-sub(".asc",".dat",file_summary$filename)
file_list<-list.files()
myfunction <- function(k) {
  x<-M$x[k]
  y<-M$y[k]
  for (i in 1:length(file_summary[,1])) {
    if ( # logical condition on x and y ) {
      new_file<-new_filename[i]
      new_data<-read.table(new_file,header=T,sep=" ")
      eval<-matrix(,nrow=length(new_data$x),ncol=1)
      for (j in 1:length(new_data$x)) {
        eval[j]<-(new_data$x[j]-x)^2+(new_data$y[j]-y)^2
      }
      index<-which(eval == max(eval))
      out<-c(new_data$x[index],new_data$y[index],new_data$mean[index],new_data$S[index])
    }
    rm(eval)
    gc()
  }
  return(out)
}
n_tasks <- length(M$x)
n_cores <- 8
Cl = makeCluster(n_cores, type = "PSOCK")
clusterExport(Cl, "M")
clusterExport(Cl, "file_summary")
clusterExport(Cl, "new_filename")
clusterExport(Cl, "file_list")
Results <- parLapply(Cl, c(1:n_tasks), myfunction)
stopCluster(Cl)
Now, using exactly the same code, data, and directory structure (i.e. paths), I am trying to run the analysis on an 8-core machine to speed it up further. However, on my first attempt I get the following error:
8 nodes produced errors; first error: cannot open the connection
I tried freeing up some RAM by closing non-R processes to see whether that helped, but it did not. Any suggestions?
Upvotes: 1
Views: 4965
Reputation: 480
You need to pass the current working directory to your function as an extra argument in parLapply(). Inside myfunction, reset the working directory with setwd():
myfunction = function(k, wd_)
{
  setwd(wd_)
  ...
}
...
wd_ = getwd()
Results <- parLapply(Cl, c(1:n_tasks), myfunction, wd_)
N.B. Make sure that R, RStudio, and Rscript are not blocked by the firewall.
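As a variation on the same idea, the directory can be set once per worker before parLapply() runs, instead of on every task. The following is only a minimal sketch: it assumes a two-worker cluster, and tempdir() is a hypothetical stand-in for the real data directory. It also asks each worker for its getwd() afterwards, which is a quick way to confirm where the workers are actually looking for files.

```r
library(parallel)

# Hypothetical stand-in for the real data directory (assumption).
wd_ <- tempdir()

cl <- makeCluster(2, type = "PSOCK")

# Set the working directory once on every worker.
invisible(clusterCall(cl, setwd, wd_))

# Ask each worker which directory it is now in.
worker_dirs <- unlist(clusterEvalQ(cl, getwd()))

stopCluster(cl)
print(worker_dirs)
```

If the printed directories do not match the folder that holds the data files, relative file names passed to read.table will not resolve on the workers.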
Upvotes: 0
Reputation: 19677
The only operation that I see in "myfunction" that can generate a "cannot open the connection" error is "read.table". You might want to add the following right before calling read.table:
if (! file.exists(new_file))
  stop(paste(new_file, "does not exist"))
It might also be useful to check that the worker has permission to read the file:
if (file.access(new_file, mode=4) == -1)
  stop(paste("no read permission on", new_file))
It seems worthwhile to rule out these problems.
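The two checks can also be bundled into a small wrapper so that the error coming back from a worker names both the file and the worker's working directory. This is just a sketch: read_checked is a hypothetical helper name, and the read.table arguments are copied from the question.

```r
# Hypothetical helper: run the checks above, then read the file.
read_checked <- function(path) {
  if (!file.exists(path)) {
    # Include the working directory, since relative paths depend on it.
    stop(sprintf("%s does not exist (working directory: %s)", path, getwd()))
  }
  if (file.access(path, mode = 4) == -1) {
    stop(sprintf("no read permission on %s", path))
  }
  read.table(path, header = TRUE, sep = " ")
}

# Usage with a file name that is assumed not to exist.
msg <- tryCatch(
  read_checked("no_such_file.dat"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Calling read_checked(new_file) in place of read.table(new_file, ...) inside myfunction should turn the generic "cannot open the connection" into a message that points at the specific file.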
Upvotes: 2