For testing purposes, inside my R package I placed the following function:
parsetup <- function() {
  cl <- parallel::makeCluster(12, type = 'PSOCK')
  parallel::clusterCall(cl, function() 1 + 1)
}
When I run mypkg::parsetup(), it takes ~6s to complete.
When I run parsetup2 <- mypkg::parsetup; parsetup2() in the global environment, it also takes ~6s to complete.
When I run the code defining the parsetup function in the global environment and then run parsetup(), it takes ~0.3s.
This seems rather silly to me; can anyone explain why and/or suggest a workaround? Adding 6s to every function where I want to use parallelisation is pretty frustrating.
edit: The difference in time occurs during the clusterCall; the number of cluster nodes created is 12 in each case.
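For reference, a minimal way to reproduce the comparison (a sketch; it assumes the package is installed as mypkg and uses system.time for the measurement):
system.time(mypkg::parsetup())   # ~6 s when the function lives in the package
parsetup <- function() {
  cl <- parallel::makeCluster(12, type = 'PSOCK')
  parallel::clusterCall(cl, function() 1 + 1)
}
system.time(parsetup())          # ~0.3 s when defined in the global environment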
sessionInfo()
R version 4.0.4 (2021-02-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
system code page: 65001
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ctsem_3.4.3 testthat_3.0.2 profvis_0.3.7 Rcpp_1.0.6
I think there are a couple of things at play here. The first one is environments and their parents. Depending on the complexity of the function, a lot of data may have to be sent to the parallel processes. Take for instance this code that creates some closures:
cl <- parallel::makeCluster(2L)
fun_factory <- function(x) {
  function() {
    list(x = x, addr = lobstr::obj_addr(x))
  }
}
fun <- fun_factory(0)
lobstr::obj_addr(environment(fun)$x)
# [1] "0x557ad9605068"
str(parallel::clusterCall(cl, fun))
# List of 2
# $ :List of 2
# ..$ x : num 0
# ..$ addr: chr "0x564b16e37e68"
# $ :List of 2
# ..$ x : num 0
# ..$ addr: chr "0x55a5a9437e68"
As you can see, the function is "contained" in an environment that holds x, and when the function is sent to the workers for evaluation, that environment needs to be "replicated", which results in copies of x with different memory addresses. You don't see the exact same behavior with clusterEvalQ because that function only serializes the code expression (see also ?base::quote), so it won't work directly in this example:
#parallel::clusterExport(cl, "fun") # uncomment this to make it work
parallel::clusterEvalQ(cl, fun())
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
2 nodes produced errors; first error: could not find function "fun"
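With the export in place (the commented-out line above), the same call succeeds, because the workers can now find fun:
parallel::clusterExport(cl, "fun")   # ship `fun` (and its environment) to the workers
parallel::clusterEvalQ(cl, fun())    # the serialized expression now resolves `fun` there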
Functions defined inside packages have a few environments "surrounding" them. I doubt everything gets serialized for evaluation in the workers, but I'm not surprised the overhead isn't negligible.
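If profiling confirms that serialization of those surrounding environments is the bottleneck, one possible workaround (a sketch, not something I have benchmarked against your package) is to detach the function you send to the workers from the package namespace:
parsetup <- function() {
  cl <- parallel::makeCluster(12, type = 'PSOCK')
  f <- function() 1 + 1
  # strip the package namespace from the closure so that only this
  # small function, not the surrounding environments, gets serialized
  environment(f) <- globalenv()
  parallel::clusterCall(cl, f)
}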
Additionally, when you execute functions that require other packages in the workers, each worker must load those packages for execution, so you don't want to recreate cl all the time.
Some overhead is unavoidable, but to avoid recreating the "cluster" all the time, you could pass the responsibility of its management to the user, which for you as a package developer might simplify a few things, since you don't have to worry about calling parallel::stopCluster.
I personally like the abstractions in foreach. In your package you define functions like:
my_par_fun <- function(x) {
  # note: a package using %dopar% must import the operator,
  # e.g. importFrom(foreach, "%dopar%") in its NAMESPACE
  foreach::foreach(x = x) %dopar% {
    x + 1
  }
}
And the code will execute sequentially if there's no parallel backend registered. If the user wants parallelization, they could install a backend package like doParallel and call something like
cl <- parallel::makeCluster(2L)
doParallel::registerDoParallel(cl)
before calling your package's functions. Once they're done, they can call parallel::stopCluster and foreach::registerDoSEQ, and it all remains transparent to your package code.
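Putting it together, a user-side session could look like this (my_par_fun being the hypothetical package function above):
cl <- parallel::makeCluster(2L)
doParallel::registerDoParallel(cl)   # register the parallel backend

mypkg::my_par_fun(1:10)              # package code now runs via %dopar%

parallel::stopCluster(cl)            # shut the workers down
foreach::registerDoSEQ()             # restore the sequential backend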
And just FYI, when using foreach, you don't need to iterate over data; you could do something like:
my_par_fun <- function(...) {
  foreach::foreach(i = 1L:foreach::getDoParWorkers()) %dopar% {
    # a very time-consuming task
  }
}
That way each worker gets a task, regardless of how many were created by the user.
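To be explicit about the fallback: foreach::getDoParWorkers() returns 1 when no parallel backend is registered, so with the sequential backend the loop body simply runs once, in the main process:
foreach::registerDoSEQ()   # explicitly register the sequential backend
foreach::getDoParWorkers()
# [1] 1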