Meinew

Reputation: 514

mpirun with Slurm: how to run multiple processes on a single CPU

I would like to write Slurm batch scripts (sbatch) to run several MPI applications, so I would like to be able to run something like this:

salloc --nodes=1 mpirun -n 6 hostname 

But I get this message:

There are not enough slots available in the system to satisfy the 6 slots that were requested by the application: hostname

Either request fewer slots for your application, or make more slots available for use.

The node actually has 4 CPUs. I am therefore looking for an option that allows more than one task per CPU, but I cannot find the right one. I know that MPI on its own can run several processes per CPU when physical resources are lacking, so I think the problem is on the Slurm side. Do you have any suggestions or comments?

Upvotes: 1

Views: 3782

Answers (1)

Zulan

Reputation: 22670

Use srun and supply the --overcommit option, for example:

test.job:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=6
#SBATCH --overcommit

srun hostname

Run sbatch test.job

From man srun:

Normally, srun will not allocate more than one process per CPU. By specifying --overcommit you are explicitly allowing more than one process per CPU.
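
As a rough illustration of when the flag matters, the sketch below adds --overcommit only when the requested task count exceeds the CPU count. The function name srun_flags and its two arguments are hypothetical, not part of Slurm; on a real node the CPU count could come from nproc or from the SLURM_CPUS_ON_NODE variable inside a job.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): print srun flags for a given
# task count and CPU count, adding --overcommit only when tasks > CPUs.
srun_flags() {
    local ntasks=$1 ncpus=$2
    if [ "$ntasks" -gt "$ncpus" ]; then
        echo "--ntasks=$ntasks --overcommit"
    else
        echo "--ntasks=$ntasks"
    fi
}

# Example: 6 tasks on the 4-CPU node from the question.
# This only prints the command; it does not launch anything.
echo "srun $(srun_flags 6 4) hostname"
```

With 6 tasks and 4 CPUs this prints the same srun invocation that the job script above effectively performs.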

Note that, depending on your cluster configuration, this may or may not also work with mpirun, but I'd stick with srun unless you have a good reason not to.
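
If you do need mpirun, here is a minimal sketch, assuming the MPI in use is Open MPI (the error message in the question looks like Open MPI's). Its mpirun accepts --oversubscribe to start more ranks than the slots it detected. Other MPI implementations use different options, and whether this works inside a Slurm allocation depends on how the cluster is set up, so treat it as something to try rather than a guaranteed fix.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4

# Assumption: mpirun here is Open MPI's launcher.
# --oversubscribe allows more ranks (6) than the slots Slurm granted (4).
mpirun --oversubscribe -n 6 hostname
```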

An important warning: by default, most MPI implementations perform terribly when overcommitted. How to address that is a different, much more difficult question.

Upvotes: 1
