Python use slurm for multiprocessing

Question

I want to run a simple task using multiprocessing (I think this one is the same as using parfor in matlab correct?)

For example:

from multiprocessing import Pool
def func_sq(i):
    fig=plt.plot(x[i,:])     #x is a ready-to-use large ndarray, just want
    fig.save(....)           #to plot each column on a separate figure

pool = Pool()
pool.map(func_sq,[1,2,3,4,5,6,7,8])

But I am very confused of how to use slurm to submit my job. I have been searching for answers but could not find a good one. Currently, while not using multiprocessing, I am using slurm job sumit file like this:(named test1.sh)

#!/bin/bash

#SBATCH -N 1
#SBATCH -n 1
#SBATCH -p batch
#SBATCH --exclusive

module load anaconda3
source activate py36
srun python test1.py

Then, I type in sbatch test1.sh in my prompt window.

So if I would like to use the multiprocessing, how should I modify my sh file? I have tried by myself but it seems just changing my -n to 16 and Pool(16) makes my job repeat 16 times.

Or is there a way to maximize my performance if multiprocessing is not suitable (I have heard about multithreating but don't know how it exactly works)

And how do I effectively use my memory so that it won't crush? (My x matrix is very large)

Thanks!

For the GPU, is that possible to do the same thing? My current submission script without multiprocessing is:

#!/bin/bash

#SBATCH -n 1
#SBATCH -p gpu
#SBATCH --gres=gpu:1

Thanks a lot!

Colin · Accepted Answer

The "-n" flag is setting the number of tasks your sbatch submission is going to execute, which is why your script is running multiple times. What you want to change is the "-c" argument which is how many CPUs each task is assigned. Your script spawns additional processes but they will be children of the parent process and share the CPUs assigned to it. Just add "#SBATCH -c 16" to your script. As for memory, there is a default amount of memory your job will be given per CPU, so increasing the number of CPUs will also increase the amount of memory available. If you're not getting enough, add something like "#SBATCH --mem=20000M" to request a specific amount.

Python use slurm for multiprocessing

Thanks!

Thanks a lot!

Answers (2)

Related Questions