Reputation: 197
I want to run a simple task using multiprocessing (I think this is the same as using parfor in MATLAB, correct?)
For example:
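from multiprocessing import Pool
import matplotlib.pyplot as plt
def func_sq(i):
    fig, ax = plt.subplots()
    ax.plot(x[i, :])   # x is a ready-to-use large ndarray, just want
    fig.savefig(....)  # to plot each row on a separate figure
    plt.close(fig)     # release the figure once it is saved
if __name__ == "__main__":
    pool = Pool()
    pool.map(func_sq, [1, 2, 3, 4, 5, 6, 7, 8])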
But I am very confused about how to use Slurm to submit my job. I have been searching for answers but could not find a good one. Currently, without multiprocessing, I am using a Slurm job submission file like this (named test1.sh):
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -p batch
#SBATCH --exclusive
module load anaconda3
source activate py36
srun python test1.py
Then I type sbatch test1.sh at the prompt.
So if I would like to use multiprocessing, how should I modify my sh file? I have tried by myself, but it seems that just changing -n to 16 and using Pool(16) makes my job repeat 16 times.
Or is there a way to maximize performance if multiprocessing is not suitable? (I have heard about multithreading but don't know exactly how it works.)
And how do I use memory effectively so that the job won't crash? (My x matrix is very large.)
For the GPU, is it possible to do the same thing? My current submission script without multiprocessing is:
#!/bin/bash
#SBATCH -n 1
#SBATCH -p gpu
#SBATCH --gres=gpu:1
Upvotes: 5
Views: 5711
Reputation: 10820
The "-n" flag sets the number of tasks your sbatch submission is going to execute, and srun launches one copy of your script per task, which is why your script is running multiple times. What you want to change is the "-c" (--cpus-per-task) argument, which is how many CPUs each task is assigned. Your script spawns additional processes, but they will be children of the parent process and share the CPUs assigned to it. Keep "-n 1" and just add "#SBATCH -c 16" to your script. As for memory, clusters are typically configured with a default amount of memory per CPU, so increasing the number of CPUs will usually also increase the amount of memory available. If you're not getting enough, add something like "#SBATCH --mem=20000M" to request a specific amount.
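One way to keep the Pool size in step with the "-c" request is to read it from the environment instead of hard-coding 16. This is only a sketch: it relies on Slurm exporting SLURM_CPUS_PER_TASK when "-c" is given, and the helper name allocated_cpus and the worker square are made up for illustration.

```python
import os
from multiprocessing import Pool


def allocated_cpus():
    """Return the number of CPUs this task may use (hypothetical helper)."""
    # Slurm exports SLURM_CPUS_PER_TASK only when -c/--cpus-per-task is requested.
    value = os.environ.get("SLURM_CPUS_PER_TASK")
    if value:
        return int(value)
    # Outside Slurm: CPUs this process is allowed to run on (Linux only),
    # otherwise the machine's total CPU count.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def square(i):
    # Stand-in for the real per-item work, such as func_sq in the question.
    return i * i


if __name__ == "__main__":
    # One worker per allocated CPU, so the pool never oversubscribes the job.
    with Pool(processes=allocated_cpus()) as pool:
        print(pool.map(square, range(1, 9)))
```

With this in place, changing "#SBATCH -c" is the only edit needed to scale the job up or down. On newer Slurm releases you may also need to pass the CPU count to srun explicitly; check your site's documentation.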
Upvotes: 7
Reputation: 6113
I don't mean to be unwelcoming here, but this question seems to indicate that you don't actually understand the tools you're using. Python's multiprocessing module allows a single Python program to launch child processes to help it perform work in parallel. This is particularly helpful because multithreading (which is commonly how you'd accomplish this in other programming languages) doesn't gain you parallel execution of CPU-bound Python code, due to CPython's Global Interpreter Lock.
Slurm (which I don't use, but from some quick research) seems to be a fairly high-level utility that allows individuals to submit work to some sort of cluster of computers (or a supercomputer... usually similar concepts). It has no visibility, per se, into how the program it launches runs; that is, it has no relationship to the fact that your Python program proceeds to launch 16 (or however many) helper processes. Its job is to schedule your Python program to run as a black box, then sit back and make sure it finishes successfully.
You seem to have some vague data processing problem. You describe it as a large matrix, but you don't give nearly enough information for me to actually understand what you're trying to accomplish. Regardless, if you don't actually understand what you're doing and how the tools you're using work, you're just flailing until you maybe eventually get lucky enough for this to work. Stop guessing, figure out what these tools do, look around and read documentation, then figure out what you're trying to accomplish and how you could go about splitting up the work in a reasonable fashion.
Here's my best guess, but I really have very little information to work from so it may not be helpful at all:
Pool().map is probably the right direction to be headed in. Create some Python generator that produces rows of your data matrix, then pass that generator and func_sq to pool.map, and sit back and wait for the job to finish.
This doesn't sound like a trivial problem, and even if it were, you don't give sufficient details for me to provide a robust answer. There's no "just fix this one line" answer to what you've asked, but I hope this helps give you an idea of what your tools are doing and how to proceed from here.
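A rough sketch of the generator idea, under some assumptions: the names iter_rows, process_row and run are hypothetical, a small list of lists stands in for the large ndarray, and process_row only sums a row where the real function would plot and save it. It uses pool.imap rather than pool.map because map turns a generator into a full list before dispatching any work, while imap takes items from it as it goes.

```python
from multiprocessing import Pool


def iter_rows(matrix):
    """Yield (index, row) pairs one at a time instead of building a list."""
    for i, row in enumerate(matrix):
        yield i, row


def process_row(item):
    # Stand-in for func_sq: the real version would plot the row and save
    # the figure; here it only summarizes the row.
    i, row = item
    return i, sum(row)


def run(matrix, processes=2):
    with Pool(processes=processes) as pool:
        # chunksize batches several rows per message to the workers,
        # which cuts inter-process communication overhead.
        return list(pool.imap(process_row, iter_rows(matrix), chunksize=4))


if __name__ == "__main__":
    # Small stand-in for the large matrix x from the question.
    x = [[r * 10 + c for c in range(5)] for r in range(8)]
    print(run(x))
```

Each row is pickled and sent to a worker, so for a very large matrix it may be cheaper to pass only the row index and let the workers read a shared or memory-mapped array; which is better depends on how x is stored.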
Upvotes: -2