Reputation: 153
I seek to parallelize an R script on a SLURM HPC cluster using the future.batchtools package. While the script is executed on multiple nodes, it only uses 1 CPU instead of the 12 that are available.
So far, I have tried different configurations (cf. code attached), none of which lead to the expected result. My bash file with the configuration is as follows:
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --cpus-per-task=12
R CMD BATCH test.R output
In R, I use a foreach loop:
# First level = cluster
# Second level = multiprocess
# https://cran.r-project.org/web/packages/future.batchtools/vignettes/future.batchtools.html
library(foreach)
library(doFuture)
library(future.batchtools)
registerDoFuture()  # let %dopar% dispatch through the future framework

plan(list(batchtools_slurm, multiprocess))

# Parallel for loop
result <- foreach(i = 1:100) %dopar% {
  Sys.sleep(100)
  return(i)
}
I would appreciate it if someone could give me guidance on how to configure the code to use multiple nodes and multiple cores.
Upvotes: 7
Views: 1784
Reputation: 551
Since you are running in batch and using more than one node, consider combining MPI with mclapply's multicore fork. These are closer to what actually happens in the hardware and give you control over the number of R instances per node and the core use of each instance. Example SLURM and PBS scripts, with an accompanying R batch script, are in https://github.com/RBigData/mpi_balance, illustrating how to balance multicore and multinode parallelism.
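For concreteness, here is a minimal sketch in that spirit (my own code, not taken from the linked repo): one MPI rank per node, each rank forking across its cores with mclapply. It assumes pbdMPI and a system MPI library are installed and that each node has 12 cores.
# Run with one rank per node, e.g.: srun --nodes=2 --ntasks-per-node=1 Rscript balance.R
library(pbdMPI, quietly = TRUE)
library(parallel)
init()

# Split the 100 work items across the MPI ranks
my_items <- seq(comm.rank() + 1, 100, by = comm.size())

# Within each rank, fork across the node's cores
res <- mclapply(my_items, function(i) i^2, mc.cores = 12)

# Gather every rank's results on rank 0 and report
all_res <- gather(res)
comm.print(length(unlist(all_res)))
finalize()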
Upvotes: 0
Reputation: 853
There are various ways of parallelizing R jobs with Slurm. A few to note:
You can use a single node with multiple cores, in which case mclapply is a nice alternative, since in principle it is faster and more memory-efficient than, say, parLapply.
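For instance, a minimal single-node sketch (my own; the workload is a stand-in) that uses however many CPUs Slurm granted the task:
library(parallel)

# Number of CPUs Slurm allocated to this task (defaults to 1 if unset)
n_cores <- as.integer(Sys.getenv("SLURM_CPUS_PER_TASK", "1"))

res <- mclapply(1:100, function(i) {
  Sys.sleep(1)  # stand-in for real work
  i^2
}, mc.cores = n_cores)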
You can use job arrays with Slurm, meaning that you can write a single R script which Slurm will run multiple times when you specify the option --array=1-[# of replicates]. You can control what each job does through the SLURM_ARRAY_TASK_ID environment variable (you can read it in R with Sys.getenv("SLURM_ARRAY_TASK_ID")) and an if/else statement based on it.
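As a minimal sketch of the job-array approach (my own; the parameter grid is made up), submitted with something like sbatch --array=1-10 job.sh, where job.sh calls Rscript on this file:
# Each array task reads its own index from the environment
task_id <- as.integer(Sys.getenv("SLURM_ARRAY_TASK_ID"))

# Hypothetical parameter grid: one value per array task
params <- seq(0.1, 1.0, by = 0.1)
res <- mean(rnorm(1e6, mean = params[task_id]))

# One output file per task; combine them after all tasks finish
saveRDS(res, sprintf("result_%02d.rds", task_id))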
As @george-ostrouchov mentioned, you can use MPI, for which you would need the Rmpi package installed, but that can be a bit painful at times.
Another thing you can try is creating a SOCKET cluster object using the parallel package. This would be a multi-node cluster that would allow you to span your calculations across multiple nodes using parLapply and others.
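A minimal sketch of that multi-node SOCKET approach (my own; it assumes the allocated nodes can reach each other over ssh, which is how makePSOCKcluster launches its workers by default):
library(parallel)

# Expand Slurm's compact node list, e.g. "node[01-02]" -> "node01" "node02"
nodes <- system("scontrol show hostnames $SLURM_JOB_NODELIST", intern = TRUE)

# Start 12 workers on each node (match this to the cores you requested)
cl <- makePSOCKcluster(rep(nodes, each = 12))

res <- parLapply(cl, 1:100, function(i) i^2)
stopCluster(cl)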
I do have a tutorial in which I explain these options and how you can use the slurmR package (which I'm working on, and which will be on CRAN soon) or the rslurm package (which is already on CRAN). You can take a look at the tutorial here.
Upvotes: 3