Reputation: 37
I have some code that I am trying to run in parallel using the foreach package. The code works, but on a computer with 4 cores it takes about 26 min, and on one with 32 cores it still takes about 13 min to finish. I am wondering whether I am doing something wrong, since I am using eight times as many cores but only halving the run time. My code looks like this:
no_cores <- detectCores()
cl <- makeCluster(no_cores)
registerDoParallel(cl)
Xenopus_Data <- foreach(b=1:length(newly_populated_vec),
                        .packages = c("raster", "gdistance", "rgdal", "sp")) %dopar% {
  Xenopus_Walk(altdata=altdata,
               water=water,
               habitat_suitability=habitat_suitability,
               max_range_without_water=max_range_without_water,
               max_range=max_range,
               slope=slope,
               Start_Pt=newly_populated_vec[b])
}
stopCluster(cl)
For the computer with 4 cores I get the following time:
Time_of_Start
[1] "2016-07-12 13:07:23 CEST"
Time_of_end
[1] "2016-07-12 13:33:10 CEST"
And for the one with 32 cores:
Time_of_Start
[1] "2016-07-12 14:35:48 CEST"
Time_of_end
[1] "2016-07-12 14:48:08 CEST"
Is this normal? If so, does anyone know how to speed it up further, maybe using different packages? Any help is greatly appreciated!
EDIT: these are the times I get after applying the corrections as suggested. For 32 cores:
user system elapsed
5.99 40.78 243.97
For 4 cores:
user system elapsed
1.91 0.94 991.71
Note that I previously repeated the calculation several times inside some loops, which is why the overall computation time has dropped so drastically. Still, I believe the gap between the two computers has widened: the 32-core machine is now roughly four times faster (about 244 s versus 992 s elapsed).
Upvotes: 2
Views: 297
Reputation: 12935
Try this and let me know if your problem is solved:
library(doParallel)
library(foreach)

# Register a parallel backend that uses every detected core
registerDoParallel(cores=detectCores())

n <- length(newly_populated_vec)
cat("\nN = ", n, " | Parallel workers count = ", getDoParWorkers(), "\n\n", sep="")

# Time the parallel loop; seq_len(n) is safe even when n is 0
t0 <- proc.time()
Xenopus_Data <- foreach(b=seq_len(n), .packages = c("raster", "gdistance", "rgdal", "sp"), .combine=rbind) %dopar% {
  Xenopus_Walk(
    water=water,
    altdata=altdata,
    habitat_suitability=habitat_suitability,
    max_range_without_water=max_range_without_water,
    max_range=max_range,
    slope=slope,
    Start_Pt=newly_populated_vec[b])
}
TIME <- proc.time() - t0
Also, monitor the logical cores on your machine while the code runs to check that all of them are involved in the computation (Task Manager on Windows, htop on Linux).
Please also keep in mind that doubling the number of cores does not necessarily double the performance: any part of the job that runs serially, plus the overhead of starting workers and moving data to them, limits the achievable speedup.
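To put a rough number on that last point, here is a minimal sketch of Amdahl's law in base R. The function name amdahl_speedup and the parallel fraction p = 0.9 are illustrative assumptions, not measurements from your code, and the model ignores differences in per-core speed between the two machines.

```r
# Amdahl's law: predicted speedup on n workers when a fraction p of the
# single-core run time can be parallelised and the rest stays serial.
amdahl_speedup <- function(p, n) {
  1 / ((1 - p) + p / n)
}

p <- 0.9                         # hypothetical parallel fraction (assumption)
cores <- c(1, 4, 8, 16, 32)      # worker counts to compare
speedup <- amdahl_speedup(p, cores)

# Show the predicted speedup for each worker count
print(data.frame(cores = cores, speedup = round(speedup, 2)))
```

Under that assumed p, moving from 4 to 32 workers would give only about a 2.5-fold improvement rather than 8-fold, so a sub-linear gain like the one you observed is not surprising in itself.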
Upvotes: 1