user3345989

Reputation: 71

Make use of all CPUs on SLURM

I would like to run a job on the cluster. Different nodes have different numbers of CPUs, and I have no idea which nodes will be assigned to me. What are the proper options so that the job can create as many tasks as there are CPUs across all allocated nodes?

#!/bin/bash -l

#SBATCH -p normal
#SBATCH -N 4
#SBATCH -t 96:00:00

srun -n 128 ./run

Upvotes: 6

Views: 2409

Answers (1)

j23

Reputation: 3530

One dirty hack to achieve the objective is to use the environment variables that SLURM sets inside the job. For a sample sbatch file:

#!/bin/bash
#SBATCH --job-name=test
#SBATCH --output=res.txt
#SBATCH --time=10:00
#SBATCH --nodes=2
# Inspect the allocation geometry SLURM exposes as environment variables
echo $SLURM_CPUS_ON_NODE
echo $SLURM_JOB_NUM_NODES
num_core=$SLURM_CPUS_ON_NODE
num_node=$SLURM_JOB_NUM_NODES
# Total tasks = CPUs per node x number of nodes (assumes homogeneous nodes)
proc_num=$(( num_core * num_node ))
echo $proc_num
srun -n $proc_num ./run

Only the number of nodes is requested in the job script. $SLURM_CPUS_ON_NODE provides the number of CPUs on the node running the script. You can use it together with other environment variables (e.g. $SLURM_JOB_NUM_NODES) to compute the number of possible tasks. The dynamic task calculation in the script above assumes the nodes are homogeneous (i.e. $SLURM_CPUS_ON_NODE gives a single number that holds for every allocated node).

For heterogeneous nodes, a single per-node count is not enough. $SLURM_JOB_CPUS_PER_NODE lists the CPU count for each allocated node in the format CPU_count[(xN)][,...] (e.g. 2,3 if the allocated nodes have 2 and 3 CPUs), and you can sum those entries to get the required number of tasks. $SLURM_JOB_NODELIST can also be used to look up the allocated nodes and their CPU counts.
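A minimal sketch of that summation, assuming $SLURM_JOB_CPUS_PER_NODE follows the documented CPU_count[(xN)] format; the helper name total_cpus is made up for illustration:

```shell
#!/bin/bash
# Sum CPU counts from a SLURM_JOB_CPUS_PER_NODE-style string,
# e.g. "2,3" or "16(x2),8" (16 CPUs on each of 2 nodes, plus 8 on one node).
total_cpus() {
  local total=0 entry cpus reps
  for entry in ${1//,/ }; do        # split the comma-separated list
    if [[ $entry == *"(x"* ]]; then
      cpus=${entry%%(*}             # CPUs per node, before "(x"
      reps=${entry#*x}              # repetition count, after "x"
      reps=${reps%)}                # drop the trailing ")"
    else
      cpus=$entry
      reps=1
    fi
    total=$(( total + cpus * reps ))
  done
  echo "$total"
}

proc_num=$(total_cpus "${SLURM_JOB_CPUS_PER_NODE:-0}")
echo "$proc_num"
# srun -n "$proc_num" ./run
```

The parameter expansions keep this pure bash, so it adds no dependencies to the job script.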

Upvotes: 5
