ojunk

Reputation: 899

SLURM: Embarrassingly parallel program inside an embarrassingly parallel program

I have a complex model written in Matlab. The model was not written by us and is best thought of as a "black box", i.e. fixing the relevant problems from the inside would require rewriting the entire model, which would take years.

If I have an "embarrassingly parallel" problem I can use an array to submit X variations of the same simulation with the option #SBATCH --array=1-X. However, clusters normally have a (frustratingly small) limit on the maximum array size.
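For reference, on SLURM this limit is the MaxArraySize configuration parameter; assuming the scontrol command is available on the cluster, it can be checked with:

scontrol show config | grep MaxArraySize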

Whilst using a PBS/TORQUE cluster I have got around this problem by forcing Matlab to run on a single thread, requesting multiple CPUs and then running multiple instances of Matlab in the background. An example submission script is:

#!/bin/bash
<OTHER PBS COMMANDS>
#PBS -l nodes=1:ppn=5,walltime=30:00:00
#PBS -t 1-600

<GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON ARRAY NUMBER>

# define Matlab options
options="-nodesktop -noFigureWindows -nosplash -singleCompThread"

for sub_job in {1..5}
do
    <GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON LOOP NUMBER (i.e. sub_job)>
    matlab ${options} -r "run_model(${arg1}, ${arg2}, ..., ${argN}); exit" &
done
wait
<TIDY UP AND FINISH COMMANDS>

Can anyone help me do the equivalent on a SLURM cluster?

Upvotes: 5

Views: 2615

Answers (3)

Oren

Reputation: 5309

You can do it with Python and subprocess. In the approach I describe below, you just set the number of nodes and tasks and that is it: there is no need for an array, and no need to match the size of the array to the number of simulations. It will simply execute the Python code until it is done, and more nodes means faster execution.

It is also easier to manage the parameters, since everything is prepared in Python (which is easier than bash).

It does assume that the Matlab scripts save their output to file; nothing is returned by the function below (though that can be changed).

In the sbatch script you need to add something like this:

#!/bin/bash
#SBATCH --output=out_cluster.log
#SBATCH --error=err_cluster.log
#SBATCH --time=8:00:00
#SBATCH --nodes=36
#SBATCH --exclusive
#SBATCH --cpus-per-task=2

export IPYTHONDIR="`pwd`/.ipython"
export IPYTHON_PROFILE=ipyparallel.${SLURM_JOBID}

whereis ipcontroller

sleep 3
echo "===== Beginning ipcontroller execution ======"
ipcontroller --init --ip='*' --nodb --profile=${IPYTHON_PROFILE} --ping=30000 & # --sqlitedb
echo "===== Finish ipcontroller execution ======"
sleep 15
srun ipengine --profile=${IPYTHON_PROFILE} --timeout=300 &
sleep 75
echo "===== Beginning python execution ======"

python run_simulations.py

Depending on your system, you may need to adapt this setup; read more here: https://ipyparallel.readthedocs.io/en/latest/process.html

and run_simulations.py should contain something like this:

import os
import sys
import subprocess

from ipyparallel import Client
from tqdm import tqdm

def run_sim(x):
    # here we use subprocess, but you can also use any python function.
    # the imports are repeated inside the function because it executes
    # remotely on the ipyparallel engines.
    import os
    import subprocess

    # send job! with the list form of Popen the -r argument must not carry
    # extra quotes, and '; exit' makes Matlab quit once the run finishes
    p1 = subprocess.Popen(
        ['matlab', '-r', f'run_model({x[0]},{x[1]}); exit'],
        env=dict(**os.environ),
    )
    p1.wait()

    return

# load ipython parallel
rc = Client(profile=os.getenv('IPYTHON_PROFILE'))
print('Using ipyparallel with %d engines' % len(rc))
sys.stdout.flush()
lview = rc.load_balanced_view()

to_send = []
# prepare variables  <-- here you should prepare the arguments for matlab
for param_1 in [1, 2, 3, 4]:
    for param_2 in [10, 20, 40]:
        to_send.append([param_1, param_2])



ind_raw_features = lview.map_async(run_sim, to_send)
all_results = []

print('Sending jobs'); sys.stdout.flush()
for i in tqdm(ind_raw_features, file=sys.stdout):
    all_results.append(i)

You also get a progress bar on stdout, which is nice. You can also easily add a check at the start of run_sim to see whether the output file already exists, and skip that run if it does.

Upvotes: 1

Tom de Geus

Reputation: 5965

I am not a big expert on array jobs but I can help you with the inner loop.

I would always use GNU parallel to run several serial processes in parallel within a single job that has more than one CPU available. It is a simple Perl script, so not difficult to 'install', and its syntax is extremely easy. What it basically does is run some (nested) loop in parallel. Each iteration of this loop contains a (long) process, like your Matlab command. In contrast to your solution, it does not start all these processes at once; it runs only N processes at the same time (where N is the number of CPUs you have available). As soon as one finishes, the next one is started, and so on until your entire loop is finished. It is perfectly fine that not all processes take the same amount of time: as soon as a CPU is freed, another process is started.
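As a minimal toy illustration (hypothetical commands, separate from the job script below):

parallel --max-procs=2 "echo task {1}; sleep 1" ::: {1..6}

This runs six short tasks, but never more than two at the same time; as soon as one finishes, the next one is started.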

Then, what you would like to do is launch 600 jobs (for which I substitute 3 below, to show the complete behaviour), each with 5 CPUs. To do that you could do the following (where I have not included the actual run of matlab, but that can trivially be included):

#!/bin/bash
#SBATCH --job-name example
#SBATCH --out job.slurm.out
#SBATCH --nodes 1
#SBATCH --ntasks 1
#SBATCH --cpus-per-task 5
#SBATCH --mem 512
#SBATCH --time 30:00:00
#SBATCH --array 1-3

cmd="echo matlab array=${SLURM_ARRAY_TASK_ID}"

parallel --max-procs=${SLURM_CPUS_PER_TASK} "$cmd,subjob={1}; sleep 30" ::: {1..5}

Submitting this job using:

$ sbatch job.slurm

submits 3 jobs to the queue. For example:

$ squeue | grep tdegeus
         3395882_1     debug  example  tdegeus  R       0:01      1 c07
         3395882_2     debug  example  tdegeus  R       0:01      1 c07
         3395882_3     debug  example  tdegeus  R       0:01      1 c07

Each job gets 5 CPUs. These are exploited by the parallel command to run your inner loop in parallel. Once again, the range of this inner loop may be (much) larger than 5; parallel takes care of the balancing between the 5 available CPUs within this job.

Let's inspect the output:

$ cat job.slurm.out

matlab array=2,subjob=1
matlab array=2,subjob=2
matlab array=2,subjob=3
matlab array=2,subjob=4
matlab array=2,subjob=5
matlab array=1,subjob=1
matlab array=3,subjob=1
matlab array=1,subjob=2
matlab array=1,subjob=3
matlab array=1,subjob=4
matlab array=3,subjob=2
matlab array=3,subjob=3
matlab array=1,subjob=5
matlab array=3,subjob=4
matlab array=3,subjob=5

You can clearly see that the 3 × 5 processes run at the same time (their output is interleaved).

There is no need to use srun in this case: SLURM creates 3 independent jobs, and within each job everything happens on an individual compute node (i.e. as if you were running on your own system).
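For completeness, a sketch of what the inner loop could look like with the actual Matlab run included, reusing the options and the run_model placeholder from the question:

options="-nodesktop -noFigureWindows -nosplash -singleCompThread"

parallel --max-procs=${SLURM_CPUS_PER_TASK} \
    "matlab ${options} -r 'run_model(${SLURM_ARRAY_TASK_ID}, {1}); exit'" ::: {1..5}

Here {1} is substituted by parallel with each value after :::, while ${options} and ${SLURM_ARRAY_TASK_ID} are expanded by the shell before parallel runs.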


Installing GNU Parallel - option 1

To 'install' GNU Parallel into your home folder, for example in ~/opt:

  1. Download the latest GNU Parallel (parallel-latest.tar.bz2 from https://ftp.gnu.org/gnu/parallel/).

  2. Make the directory ~/opt if it does not yet exist

    mkdir $HOME/opt
    
  3. 'Install' GNU Parallel:

    tar jxvf parallel-latest.tar.bz2
    cd parallel-XXXXXXXX
    ./configure --prefix=$HOME/opt
    make
    make install
    
  4. Add ~/opt to your path:

    export PATH=$HOME/opt/bin:$PATH
    

    (To make it permanent, add that line to your ~/.bashrc.)


Installing GNU Parallel - option 2

Use conda.

  1. (Optional) Create a new environment

    conda create --name myenv
    
  2. Activate the environment:

    conda activate myenv
    
  3. Install GNU parallel:

    conda install -c conda-forge parallel 
    

Note that the command is available only when the environment is loaded.

Upvotes: 6

Milliams

Reputation: 1534

While Tom's suggestion to use GNU Parallel is a good one, I will attempt to answer the question asked.

If you want to run 5 instances of the matlab command with the same arguments (for example, if they were communicating via MPI) then you would want to ask for --cpus-per-task=1 and --ntasks=5, preface your matlab line with srun, and get rid of the loop.
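Sketched, that MPI-style variant would look like this (the run_model arguments and options are the placeholders from your question):

#SBATCH --ntasks=5
#SBATCH --cpus-per-task=1

srun matlab ${options} -r "run_model(${arg1}, ..., ${argN}); exit"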

In your case, as each of your 5 calls to matlab is independent, you want to ask for --cpus-per-task=5 and --ntasks=1. This will ensure that you allocate 5 CPU cores per job to do with as you wish. You can preface your matlab line with srun if you wish, but it will make little difference as you are only running one task.
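For concreteness, here is a minimal sketch of your PBS script translated to SLURM under these settings (the placeholder lines are copied from your question; note that --array is still subject to your cluster's maximum array size, as you mention):

#!/bin/bash
<OTHER SBATCH COMMANDS>
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=5
#SBATCH --time=30:00:00
#SBATCH --array=1-600

<GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON ARRAY NUMBER>

# define Matlab options
options="-nodesktop -noFigureWindows -nosplash -singleCompThread"

for sub_job in {1..5}
do
    <GATHER DYNAMIC ARGUMENTS FOR MATLAB FUNCTION CALLS BASED ON LOOP NUMBER (i.e. sub_job)>
    matlab ${options} -r "run_model(${arg1}, ${arg2}, ..., ${argN}); exit" &
done
wait
<TIDY UP AND FINISH COMMANDS>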

Of course, this is only efficient if each of your 5 matlab runs takes roughly the same amount of time, since if one takes much longer the other 4 CPU cores will sit idle waiting for the fifth to finish.

Upvotes: 4
