prantik sarmah

Reputation: 11

Parallel jobs in SLURM

How can I run a number of Python scripts on different nodes in SLURM?

Suppose I select 5 cluster nodes using #SBATCH --nodes=5, and I have 5 Python scripts code1.py, code2.py, ..., code5.py. I want to run each of these scripts on one of the 5 nodes simultaneously. How can I achieve this?

Upvotes: 1

Views: 706

Answers (1)

Marcus Boden

Reputation: 1685

Do these five scripts need to run in a single job? Do they really need to run simultaneously? Is there some communication happening between them? Or are they independent of one another?

If they are essentially independent, you should most likely put them into 5 different jobs with one node each. That way you don't have to wait until five nodes are free at the same time; the first job can start as soon as a single node is available. If there are many scripts you want to start like that, it might be worth looking into job arrays (see the sketch below).
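
A minimal job-array sketch, assuming the scripts are literally named code1.py through code5.py (the --cpus-per-task value is an assumption that just mirrors the -c 10 used in the job script further down):

#!/bin/bash
#SBATCH --array=1-5         # one array task per script
#SBATCH --nodes=1           # each array task is an independent one-node job
#SBATCH --cpus-per-task=10  # assumption: same core count as the example below

# SLURM sets SLURM_ARRAY_TASK_ID to 1..5 for the individual array tasks,
# so each task picks its own script by index.
python code${SLURM_ARRAY_TASK_ID}.py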

If you do need to run them in parallel within a single job, you will have to use srun in your job script to start the scripts. The example below shows a job with one task per node and 10 cores per task.

#!/bin/bash
#[...]
#SBATCH -N 5   # 5 nodes
#SBATCH -n 5   # 5 tasks in total, i.e. one per node
#SBATCH -c 10  # 10 cores per task
#[...]

# Launch each script as its own job step on a single node, in the background.
srun -N1 -n1 python code1.py &
srun -N1 -n1 python code2.py &
srun -N1 -n1 python code3.py &
srun -N1 -n1 python code4.py &
srun -N1 -n1 python code5.py &
wait  # block until all five steps have finished

You need to run the srun calls in the background (the trailing &), as bash would otherwise wait for each one to finish before starting the next. The final wait keeps the job alive until all five steps are done.
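
If the scripts really do follow the codeN.py naming pattern (an assumption based on the question), the five srun lines could also be written as a loop:

for i in 1 2 3 4 5; do
    srun -N1 -n1 python "code${i}.py" &  # one background job step per script
done
wait  # wait for all five steps to finish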

Upvotes: 3
