itsmrbeltre

Reputation: 442

SLURM: How can I run different executables in parallel on the same compute node or on different nodes?

Goal:

  1. learn how to run or co-schedule multiple executables/applications within a single sbatch job submission
  2. using either srun or mpirun

Research:

Code snippet:

 #!/bin/bash
 #SBATCH --job-name LEBT 
 #SBATCH --partition=angel
 #SBATCH --nodelist=node38
 #SBATCH --sockets-per-node=1
 #SBATCH --cores-per-socket=1
 #SBATCH --time 00:10:00 
 #SBATCH --output LEBT.out

 # load the MPI environment (srun itself is provided by SLURM)
 module load openmpi


 srun  -n 1   ./LU.exe -i 100 -s 100  &
 srun  -n 1   ./BT.exe  &

 wait 

Man Pages:

 [srun]-->[https://computing.llnl.gov/tutorials/linux_clusters/man/srun.txt]

 [mpirun]-->[https://www.open-mpi.org/doc/v1.8/man1/mpirun.1.php]

Upvotes: 4

Views: 5175

Answers (2)

damienfrancois

Reputation: 59180

Your script will work modulo a minor modification. If you do not care whether your processes run on the same node or not, add #SBATCH --ntasks=2:

#!/bin/bash
#SBATCH --job-name LEBT 
#SBATCH --ntasks=2
#SBATCH --partition=angel
#SBATCH --nodelist=node38
#SBATCH --sockets-per-node=1
#SBATCH --cores-per-socket=1
#SBATCH --time 00:10:00 
#SBATCH --output LEBT.out

# load the MPI environment (srun itself is provided by SLURM)
module load openmpi

srun  -n 1 --exclusive  ./LU.exe -i 100 -s 100  &
srun  -n 1 --exclusive  ./BT.exe  &

wait 

The --exclusive argument tells srun to run with only a subset of the whole allocation; see the srun manpage.
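To see what --exclusive buys you, you can time two trivial steps: on many Slurm versions, concurrent steps launched without --exclusive each claim the whole allocation and therefore run one after the other. A minimal sketch under the same #SBATCH header as above (the sleep commands and the timings are illustrative assumptions, not from the original scripts):

# with --exclusive, each step is confined to one task's resources,
# so the two sleeps overlap and the job takes roughly 10 seconds;
# without it, the steps may serialize to roughly 20 seconds
srun -n 1 --exclusive sleep 10 &
srun -n 1 --exclusive sleep 10 &

wait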

If you want both processes to run on the same node, use --cpus-per-task=2:

#!/bin/bash
#SBATCH --job-name LEBT 
#SBATCH --cpus-per-task=2
#SBATCH --partition=angel
#SBATCH --nodelist=node38
#SBATCH --sockets-per-node=1
#SBATCH --cores-per-socket=1
#SBATCH --time 00:10:00 
#SBATCH --output LEBT.out

# load the MPI environment (srun itself is provided by SLURM)
module load openmpi

srun  -c 1 --exclusive  ./LU.exe -i 100 -s 100  &
srun  -c 1 --exclusive  ./BT.exe  &

wait 

Note that, in that case, you must run srun with -c 1 rather than with -n 1.
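A quick way to verify the placement is to substitute hostname for the real executables inside the same allocation; both steps should then print the same node name (node38 here). A minimal sketch, reusing the #SBATCH header above:

srun -c 1 --exclusive hostname &
srun -c 1 --exclusive hostname &

wait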

Upvotes: 5

itsmrbeltre

Reputation: 442

After extensive research, I have concluded that srun is the command you want to use to run jobs in parallel. Moreover, you need a helper script to adequately execute the whole process. I wrote the following script, which executes the applications on one node with no problem.

#!/usr/bin/python
#SBATCH --job-name TPython
#SBATCH --output=ALL.out
#SBATCH --partition=magneto
#SBATCH --nodelist=node1

import threading
import os

addlock = threading.Lock()

class jobs_queue(threading.Thread):
    def __init__(self, job):
        threading.Thread.__init__(self, args=(addlock,))
        self.job = job

    def run(self):
        self.job_executor(self.job)

    def job_executor(self, cmd):
        # launch the job step and block this thread until it exits
        os.system(cmd)

if __name__ == "__main__":

    joblist = ["srun ./executable2",
               "srun ./executable1 -i 20 -s 20"]

    # create one thread per job step
    threads = [jobs_queue(job) for job in joblist]

    # start all job steps concurrently
    [t.start() for t in threads]

    # wait for every job step to finish
    [t.join() for t in threads]

In my particular case, with the flags shown, each executable takes around 55 seconds when run on its own. However, when they were run in parallel, each took around 59 seconds.
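If job accounting is enabled on the cluster (an assumption; not every site runs it), you can confirm that the two steps actually overlapped by comparing their start and end times, where 1234 stands in for the real job id:

sacct -j 1234 --format=JobID,JobName,Start,End,Elapsed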

Upvotes: 0
