Reputation: 1
I have been struggling to get multiple instances of a python script to run on SLURM. On my login node I have installed python3.6, and I have a python script "my_script.py" which takes a text file as input to read in run parameters. I can run this script on the login node using
python3.6 my_script.py input1.txt
Furthermore, I can submit a script submit.sh to run the job:
#!/bin/bash
#
#SBATCH --job-name=hostname_sleep_sample
#SBATCH --output=output1.txt
#SBATCH --cpus-per-task=1
#
#SBATCH --mem=2G
python3.6 my_script.py input1.txt
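which I submit with:
sbatch submit.sh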
This runs fine and executes as expected. However, if I submit the following script:
#!/bin/bash
#
#SBATCH --job-name=hostname_sleep_sample
#SBATCH --output=output2.txt
#SBATCH --cpus-per-task=1
#
#SBATCH --mem=2G
python3.6 my_script.py input2.txt
while the first job is still running, I get the following error message in output2.txt:
/var/spool/slurmd/job00130/slurm_script: line 9: python3.6: command not found
I found that I have this same issue when I try to submit a job as an array. For example, when I submit the following with sbatch:
#!/bin/bash
#
#SBATCH --job-name=hostname_sleep_sample
#SBATCH --output=out_%j.txt
#SBATCH --array=1-10
#SBATCH --cpus-per-task=1
#
#SBATCH --mem=2G
echo PWD $PWD
cd $SLURM_SUBMIT_DIR
python3.6 my_script.py input_$SLURM_ARRAY_TASK_ID.txt
I find that only out_1.txt shows that the job ran. All of the output files for tasks 2-10 show the same error message:
/var/spool/slurmd/job00130/slurm_script: line 9: python3.6: command not found
I am running all of these scripts on an HPC cluster that I created using the Compute Engine API on Google Cloud Platform. I used the following tutorial to set up the SLURM cluster:
https://codelabs.developers.google.com/codelabs/hpc-slurm-on-gcp/#0
Why is SLURM unable to run multiple python3.6 jobs at the same time, and how can I get my array submission to work? I have spent days going through SLURM FAQs and other Stack Overflow questions, but I have not found a way to resolve this or a suitable explanation of what's causing it in the first place.
Thank you
Upvotes: 0
Views: 1136
Reputation: 1
I found out what I was doing wrong. I had created a cluster with two compute nodes, compute1 and compute2. At some point, while trying to get things to work, I had submitted a job to compute1 containing the following commands:
# Install Python 3.6
sudo yum -y install python36
# Install python-setuptools which will bring in easy_install
sudo yum -y install python36-setuptools
# Install pip using easy_install
sudo easy_install-3.6 pip
from the following post:
How do I install python 3 on google cloud console?
This had installed python3.6 on compute1, which is why my jobs would run on compute1. However, I didn't think that script had run successfully, so I never submitted it to compute2, and therefore jobs sent to compute2 could not find python3.6. For some reason I thought Slurm was using the python3.6 from the login node, since I had sourced a path to it in my sbatch submission.
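The fix was simply to run the same install on compute2 as well. One way to do that (a sketch; --nodelist pins the job to the named node, and it assumes your account can sudo on the compute nodes, as in the commands above) is:
# Run the same install commands as a job pinned to compute2
# (--nodelist/-w restricts the job to that node)
sbatch --nodelist=compute2 --wrap="sudo yum -y install python36 python36-setuptools && sudo easy_install-3.6 pip"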
After installing python3.6 on compute2, I was also able to import all of my locally installed python libraries, following the link below, by including
import sys
import os
sys.path.append(os.getcwd())
at the beginning of my python script.
How to import a local python module when using the sbatch command in SLURM
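As a sanity check before resubmitting the array, you can confirm that every node actually resolves the interpreter, for example (compute1 and compute2 being my node names):
# Should print the python3.6 path on each node;
# "command not found" means it is still missing there
srun -w compute1 which python3.6
srun -w compute2 which python3.6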
Upvotes: 0