Mark Sale
Mark Sale

Reputation: 65

ClusterFuture with snow blocks

I'm trying to get a long running, embarrassing parallel analysis using snow and future to run asynchronously. But, the ClusterFuture is blocking, simplified code below. Is there a way to keep ClusterFuture from blocking? Or am I just doing something wrong? Running R version 3.5.3 on 64 bit Windows (and eventually on Linux as well).

thanks Mark

tried just lapply with snow, and just future. ClusterFuture with parLapply works really nicely, execution time exactly what it should be (8 times as fast). But, it blocks and I'd really like it to behave like a regular future (and return control to the console).

rm(list=ls())
RunNM2 <- function(index){
  Sys.sleep(4)
  return(index)
}
library(tictoc)
library(future)
library(snow)
cl <- future::makeClusterPSOCK(rep("localhost",8),makeNode =         
 makeNodePSOCK)
plan(cluster, workers = cl)
tic("cluster")
res.1 <- ClusterFuture(parLapply(cl,1:8,RunNM2),worker=cl )
##blocks here
res <- value(res.1)
toc()
stopCluster(cl)
rm(cl)

Upvotes: 1

Views: 47

Answers (1)

Ralf Stubner
Ralf Stubner

Reputation: 26823

In your code the actual parallel workload is not handled by future but by snow::parLapply. You can see that with the following example, where I am using parallel instead of snow, which I would treat as deprecated for simple PSOCK clusters:

RunNM2 <- function(index){
    Sys.sleep(4)
    return(index)
}
library(tictoc)
library(parallel)
cl <- makePSOCKcluster(rep("localhost",8))
tic("cluster")
res <- parLapply(cl,1:8,RunNM2)
toc()
#> cluster: 4.015 sec elapsed
stopCluster(cl)
rm(cl)

Created on 2019-06-04 by the reprex package (v0.3.0)

So currently you are creating one future out of the result of your parallel computation. Instead you should create several futures, that are then evaluated in parallel:

RunNM2 <- function(index){
    Sys.sleep(4)
    return(index)
}
library(tictoc)
library(future)
cl <- makeClusterPSOCK(rep("localhost",8))
plan(cluster, workers = cl)
tic("cluster")
res.1 <- lapply(1:8, function(index) future(RunNM2(index)))
res <- values(res.1)
# blocks here
toc()
#> cluster: 4.66 sec elapsed
parallel::stopCluster(cl)
rm(cl)

Created on 2019-06-04 by the reprex package (v0.3.0)

Note: As per ?cluster the prefered method for creating a ClusterFuture is future() or %<-% after registering a suitable (cluster) plan for execution.

Upvotes: 1

Related Questions