Reputation: 514
I would like to write Slurm batch scripts (sbatch) to run several MPI applications. Thus I would like to be able to run something like this:
salloc --nodes=1 mpirun -n 6 hostname
But I get this message:
There are not enough slots available in the system to satisfy the 6 slots that were requested by the application: hostname
Either request fewer slots for your application, or make more slots available for use.
The node actually has 4 CPUs. I am therefore looking for an option that allows more than one task per CPU, but I cannot find it. I know that MPI alone is able to run several processes when physical resources are missing, so I think the problem is on the Slurm side. Do you have any suggestions/comments?
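For context, the error above looks like Open MPI's, and by "MPI alone is able to run several processes" I mean something like the following (assuming Open MPI; --oversubscribe is its flag for allowing more ranks than slots):
mpirun --oversubscribe -n 6 hostname
What I would like is for the Slurm allocation itself to permit this.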
Upvotes: 1
Views: 3782
Reputation: 22670
Use srun and supply the option --overcommit, e.g. like this:
test.job:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=6
#SBATCH --overcommit
srun hostname
Run sbatch test.job
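If it works, the output file (slurm-<jobid>.out by default) should simply contain the node's hostname six times, something like:
mynode
mynode
mynode
mynode
mynode
mynode
(mynode is just a placeholder for whatever your node is called.)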
From man srun:
Normally, srun will not allocate more than one process per CPU. By specifying --overcommit you are explicitly allowing more than one process per CPU.
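If you just want a quick interactive test rather than a batch job, the same options should also work directly on the srun command line, roughly the equivalent of your original one-liner:
srun --nodes=1 --ntasks=6 --overcommit hostname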
Note that depending on your cluster configuration this may or may not also work with mpirun, but I'd stick with srun unless you have a good reason not to.
An important warning: most MPI implementations by default have terrible performance when running overcommitted. How to address that is a different, much more difficult question.
Upvotes: 1