user222552

Reputation: 105

How to make sbatch wait until last submitted job is *running* when submitting multiple jobs?

I'm running a numerical model whose parameters are in a "parameter.input" file. I use sbatch to submit multiple runs of the model, changing one parameter in the parameter file each time. Here is the loop I use:

#!/bin/bash -l
for a in {01..30}
do
  sed -i "s/control_[0-9][0-9]/control_${a}/g" parameter.input
  sbatch --time=21-00:00:00 run_model.sh
  sleep 60
done

The sed line changes a parameter in the parameter file. The run_model.sh file runs the model.

The problem: depending on the resources available, a job might run immediately or stay pending for a few hours. With my current loop, if 60 seconds is not enough for job n to start running, the parameter file is modified while job n is still pending, so job n runs with the wrong parameters. I can't wait for job n to complete before submitting job n+1, because each job takes several days to complete.

How can I force sbatch to wait to submit job n+1 until job n is running?

I am not sure how to create an until loop that would grab the status of job n and wait until it changes to 'running' before submitting job n+1. I have experimented with a few things, but the server I use also hosts another 150 people's jobs, and I'm afraid too much experimenting might create some issues...

Upvotes: 4

Views: 5081

Answers (1)

user222552

Reputation: 105

Use the following to grab the last submitted job's ID and its status, and wait until it isn't pending anymore to start the next job:

sentence=$(sbatch --time=21-00:00:00 run_model.sh) # sbatch prints "Submitted batch job <ID>"
stringarray=($sentence)                            # split the output into words
jobid=${stringarray[3]}                            # the fourth word is the job ID
sentence=$(squeue -j "$jobid")                     # header line plus one line for the job
stringarray=($sentence)
jobstatus=${stringarray[12]}                       # ST column, assuming squeue's default output format
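
Splitting on word positions depends on the exact wording of the sbatch message and on the column layout of squeue. A minimal sketch of a less position-dependent variant, assuming your Slurm installation supports sbatch --parsable and the squeue options -h and -o (submit_job and job_state are hypothetical helper names, not part of Slurm):

```shell
#!/bin/bash
# Sketch only: submit_job and job_state are hypothetical helper names.

# Submit a batch script and print only its numeric job ID.
# With --parsable, sbatch prints "jobid" or "jobid;cluster" instead of a sentence.
submit_job() {
  local out
  out=$(sbatch --parsable "$@") || return 1
  printf '%s\n' "${out%%;*}"   # drop the optional ";cluster" suffix
}

# Print the compact state code of a job (PD = pending, R = running, ...).
# -h suppresses the header line and -o %t prints only the state column.
# Prints nothing if squeue no longer lists the job.
job_state() {
  squeue -h -j "$1" -o %t 2>/dev/null | head -n 1
}
```

For example, jobid=$(submit_job --time=21-00:00:00 run_model.sh) followed by jobstatus=$(job_state "$jobid") should yield the same two variables as the word-splitting version, without relying on array indices.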

Check that the job status is 'running' before submitting the next job with:

if [ "$jobstatus" = "R" ]; then
  # insert here relevant code to run next job
fi

You can put the status query and this check in an until loop that re-reads the job's status every few seconds, since $jobstatus only reflects the moment squeue was called.
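
Putting the pieces together with the loop from the question, one possible sketch looks like this. It assumes sbatch --parsable and squeue -h -o %t are available on your cluster; wait_until_started, submit_all and POLL_SECONDS are hypothetical names chosen for illustration, and the 30-second default is an arbitrary choice.

```shell
#!/bin/bash -l
# Sketch only: wait_until_started, submit_all and POLL_SECONDS are hypothetical names.

POLL_SECONDS=${POLL_SECONDS:-30}   # delay between squeue polls

# Block until the given job is no longer pending (state code PD).
# An empty state means squeue no longer lists the job (finished, failed or
# cancelled) or the query itself failed; either case ends the wait so the
# loop cannot hang forever.
wait_until_started() {
  local jobid=$1
  until [ "$(squeue -h -j "$jobid" -o %t 2>/dev/null)" != "PD" ]; do
    sleep "$POLL_SECONDS"
  done
}

# For each value given as an argument: edit the parameter file, submit the
# job, then wait until that job has left the pending state.
submit_all() {
  local a jobid
  for a in "$@"; do
    sed -i "s/control_[0-9][0-9]/control_${a}/g" parameter.input
    jobid=$(sbatch --parsable --time=21-00:00:00 run_model.sh) || return 1
    jobid=${jobid%%;*}             # drop the optional ";cluster" suffix
    wait_until_started "$jobid"
  done
}
```

Calling submit_all {01..30} at the end of the script would mirror the original loop. Keeping the poll interval at tens of seconds rather than a few keeps the load on a shared scheduler low. If the model reads parameter.input some time after the job starts, a short extra sleep after wait_until_started may still be needed.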

Upvotes: 3
