Reputation: 5
I've been using a cluster of 200 nodes with 32 cores each for simulating stochastic processes.
I have to run around 10 000 simulations of the same system, so I run the same simulation (with different RNG seeds) on the 32 cores of one node until all 10 000 simulations are done. (Each simulation is completely independent of the others.)
Depending on the seed, some of the simulations take much more time than others, so after a while I usually still have the full node allocated to me but with only one core running (i.e. I am unnecessarily occupying 31 cores).
In my sbatch script I have this:
# Specify the number of nodes (--nodes=/-N) and the number of cores per node (--ntasks-per-node=) to be used
#SBATCH -N 1
#SBATCH --ntasks-per-node=32
...
cat list.dat | parallel --colsep '\t' -j 32 ./main{} > "Results/A.out"
which runs 32 instances of ./main at a time on the same node until all lines of list.dat (10 000 lines) have been used.
Is there a way to free these unused cores for other jobs? And is there a way to send these 32 tasks to arbitrary nodes, i.e. one job submission using a maximum of 32 cores on (potentially) different nodes (whatever is free at the moment)?
Thank you!
Upvotes: 1
Views: 219
Reputation: 59090
If the cluster is configured to share compute nodes between jobs, one option is to submit a job array of 10 000 jobs. The submission script would look like this (untested):
#!/bin/bash
#SBATCH -N 1
#SBATCH --ntasks-per-node=1
#SBATCH --array=1-10000
cat list.dat | sed -n "${SLURM_ARRAY_TASK_ID} p" | xargs -I{} ./main{} > "Results/A.out_${SLURM_ARRAY_TASK_ID}"
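Assuming the script is saved as, say, job_array.sh (the name is a placeholder), a single sbatch call submits all 10 000 tasks, and each task writes its own result file:
sbatch job_array.sh    # submits the whole array in one go
ls Results/            # A.out_1, A.out_2, ... appear as tasks complete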
Every simulation would then be scheduled independently of the others and could use any free core on the cluster, without leaving cores allocated but unused.
Compared with submitting 10 000 independent jobs, the job array lets you manage all the jobs with a single command. Job arrays also put much less load on the scheduler than the same number of individual jobs.
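For instance, assuming the array was submitted under job ID 123456 (a placeholder), the standard SLURM commands operate on the whole array at once:
squeue -j 123456             # show the state of every array task
scancel 123456               # cancel the entire array
scancel 123456_[5000-10000]  # cancel only a range of array tasks
sacct -j 123456              # accounting information for all tasks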
If there is a limit on the number of jobs allowed in a job array, you can simply pack multiple simulations into the same job, either sequentially or in parallel as you are doing at the moment, but with maybe 8 or 12 cores:
#!/bin/bash
#SBATCH -N 1
#SBATCH --ntasks-per-node=12
#SBATCH --array=1-10000:100
cat list.dat | sed -n "${SLURM_ARRAY_TASK_ID},$((SLURM_ARRAY_TASK_ID+99)) p" | parallel --colsep '\t' -j 12 ./main{} > "Results/A.out_${SLURM_ARRAY_TASK_ID}"
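To make the indexing concrete: with --array=1-10000:100 the array task IDs are 1, 101, 201, ..., 9901, and the task with SLURM_ARRAY_TASK_ID=201, for example, effectively runs
cat list.dat | sed -n "201,300 p" | parallel --colsep '\t' -j 12 ./main{} > "Results/A.out_201"
i.e. it processes lines 201 to 300 of list.dat, 12 at a time, and writes that chunk's output to its own file.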
Upvotes: 1