problème0123

Reputation: 871

foreach %dopar%: writing to the same file

I would like to write to the same file in parallel. I tested two methods, writeLines and sink.

path_file <- "path"
cl <- makeCluster(4)
registerDoParallel(cl)
#fileConn<-file(path_file, "w")
sink(path_file, append=TRUE)
foreach(i = 1:3) %dopar% 
{
  #writeLines(text = paste("hellow","world", i), con = fileConn, sep="\n")
  cat(paste("hellow","world", i, '\n'), file = path_file, append=TRUE)
}
sink()
#close(fileConn)
parallel::stopCluster(cl)

With the writeLines method I get an error: Error in { : task 1 failed - "incorrect connection"

With the sink method, the file also contains the result of foreach ([[1]] NULL and so on), which I don't want:

hellow world 1 
hellow world 2 
hellow world 3 
[[1]]
NULL

[[2]]
NULL

[[3]]
NULL

Upvotes: 3

Views: 1510

Answers (1)

Carlos Santillan

Reputation: 1087

Another alternative is to redirect all worker output to a file (this may not be what you want):

  library(doParallel)
  path_file <- "path1.txt"
  # outfile redirects the workers' stdout and stderr to path_file
  cl <- makeCluster(4,outfile=path_file)
  registerDoParallel(cl)
  foreach(i = 1:10) %dopar% 
  {
    message <- paste("hello","world", i,"\n")
    print(message)
  }
  parallel::stopCluster(cl)
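
As for the [[1]] NULL entries in the question: they appear to be the return value of foreach, which R prints at top level and which an active sink() then captures. A minimal sketch, assuming the return value is not needed, is to assign the result (wrapping the call in invisible() should have the same effect). The name `res` is only a placeholder.

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makeCluster(2)
registerDoParallel(cl)

# Assigning the result keeps R from auto-printing the list of NULLs
res <- foreach(i = 1:3) %dopar% {
  NULL  # stand-in for a task body whose return value is not needed
}

stopCluster(cl)
```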

Or you may want to write one file per element and then concatenate them:

  library(doParallel)
  path_file <- "path"

  cl <- makeCluster(4)
  registerDoParallel(cl)
  # Each task writes its own file, so no two workers share a file
  foreach(i = 1:103) %dopar% 
  {
    filename = paste0(path_file,i,".txt")
    message <- paste("hello","world", i,"\n")
    cat(message, file = filename, append=TRUE)
    print(message)
  }

  parallel::stopCluster(cl)

  # Back in the master process: append the pieces in order, then delete them
  startfile= "full.txt"
  for (i in 1:103)
  {
    filename = paste0(path_file,i,".txt")
    file.append(startfile,filename)
    file.remove(filename)
  }
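
If the lines do not have to be written by the workers themselves, another option (not part of the code above, just a sketch) is to have each task return its line and let the master process write the file once. This avoids the intermediate files. The output path here is a temporary file chosen for illustration, and `lines` is a placeholder name.

```r
library(doParallel)  # also attaches foreach and parallel

path_file <- tempfile(fileext = ".txt")  # hypothetical output path

cl <- makeCluster(2)
registerDoParallel(cl)

# Each task returns one line; .combine = c gathers them into a character
# vector, in task order (foreach's default is .inorder = TRUE)
lines <- foreach(i = 1:3, .combine = c) %dopar% {
  paste("hello", "world", i)
}

stopCluster(cl)

# Only the master process touches the file, so no locking is needed
writeLines(lines, path_file)
```

The trade-off is that all results are held in memory until the loop finishes, so this suits small outputs better than very large ones.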

You need to be careful when multiple workers try to access the same resource. To synchronise access to a shared resource, you can use the flock package to set a mutex. (I am not sure why the following is not working; a file connection may not be usable from several workers.)

Take a look at the following code sample:

  library(doParallel)
  library(flock)
  path_file <- "path12.txt"
  fileConn<-file(path_file,open = "a")
  lock <-tempfile()

  cl <- makeCluster(4)
  registerDoParallel(cl)
  foreach(i = 1:103) %do% 
  {
    locked <- flock::lock(lock)  # Lock in order to use shared resources
    message <- paste("hello","world", i,"\n")
    cat(message, file = fileConn, append=TRUE)
    print(message)
    flock::unlock(locked)  # Release lock
    print(message)
  }

  close(fileConn)
  parallel::stopCluster(cl)
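
A variant of the sample above that may avoid the connection problem: a sketch, assuming each worker writes to the file by name while holding the lock, instead of using a connection opened in the master (an R connection is only valid in the process that opened it). The names `lock_file` and `res` are placeholders, and the paths are temporary files chosen for illustration.

```r
library(doParallel)  # also attaches foreach and parallel
library(flock)

path_file <- tempfile(fileext = ".txt")  # hypothetical output path
lock_file <- tempfile()                  # file used only as a mutex

cl <- makeCluster(2)
registerDoParallel(cl)

res <- foreach(i = 1:10, .packages = "flock") %dopar% {
  locked <- flock::lock(lock_file)   # wait for exclusive access
  # Write by file name: cat opens and closes the file in this worker
  cat(sprintf("hello world %d\n", i), file = path_file, append = TRUE)
  flock::unlock(locked)              # release the lock for the next worker
  NULL
}

stopCluster(cl)
```

The lines arrive in whatever order the workers obtain the lock, so the file is complete but not necessarily sorted by `i`.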

Upvotes: 2
