Reputation: 1166
In the following Slurm batch script, where programs step_one and step_two are meant to run at the same time, the wait call is necessary so the job does not terminate before the job steps are done.
#!/bin/bash
#SBATCH --ntasks=2
srun --overlap -n1 step_one &
srun --overlap -n1 step_two &
wait
The wait blocks until all processes running in the background are done. If another program were to launch the processes I need to wait for, how do I achieve the same result? Without going into details about DVC, just believe me that the following launches the same two steps "in the background" and exits before they are done.
#!/bin/bash
#SBATCH --ntasks=2
dvc repro
wait # has no effect ... what would?
For those familiar with DVC, here is the pipeline file:
stages:
  one:
    cmd: srun --overlap -n1 step_one &
  two:
    cmd: srun --overlap -n1 step_two &
Here is the closest I can come, but I feel like I'm doing it wrong:
#!/bin/bash
#SBATCH --ntasks=2
dvc repro
while [ "$(sstat -n -a -j "$SLURM_JOB_ID" | wc -l)" -gt 1 ]
do
    sleep 10
done
Note that sstat gives me a job step called "$SLURM_JOB_ID.batch", hence -gt 1.
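A slightly more robust variant of that loop, I think, is to filter on the step names rather than count lines, since some clusters also report an extern step in addition to the batch step (the awk pass is my assumption about sstat's space-padded output):
while sstat -n -a -j "$SLURM_JOB_ID" --format=JobID | awk '{print $1}' | grep -qvE '\.(batch|extern)$'
do
    # A regular job step is still running; check again shortly.
    sleep 10
done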
Update: Solutions to a similar problem (that does not involve Slurm) rely on knowing the PID of the non-child processes. To use those, I would at least need the PIDs.
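For illustration only, assuming the launcher could be made to record those PIDs somewhere (say one per line in a pids.txt file, a name I'm inventing here), a polling loop like this would do the waiting; kill -0 only tests whether a process still exists and sends no signal:
#!/bin/sh
while read -r pid
do
    while kill -0 "$pid" 2>/dev/null
    do
        sleep 1
    done
done < pids.txt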
Upvotes: 2
Views: 1872
Reputation: 1166
Self-answer with my current solution.
My difficulty using DVC with Slurm jobs is that DVC runs stage commands serially (unless you get into queuing experiments, which introduces Celery, which would be another queue on top of Slurm ... yikes). If the stage commands run in the background, however, DVC will chug merrily along. But you then have to enforce the DAG manually. I did this with advisory file system locking. You also don't want to run dvc commit until the backgrounded commands have completed.
Here's a pipeline with three stages (minimal working examples of <CMD> given below). Note that the DAG allows stages one and three to run in parallel, while two must run after one.
stages:
  one:
    cmd: flock lock/a <ONE> &
    outs:
      - one.txt
  two:
    cmd: flock lock/a <TWO> &
    deps:
      - one.txt
    outs:
      - two.txt
  three:
    cmd: flock lock/b <THREE> &
    outs:
      - three.txt
The lock/a and lock/b files are created by the flock command and correspond to the two separate branches of the DAG. Using flock may not be the ultimate solution; the release order of multiple stage commands waiting on the same lock is unclear to me.
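One way to see how your system behaves, for what it's worth, is a throwaway experiment like the following (all names are placeholders); as far as I know flock makes no FIFO promise, so the order you observe may vary:
#!/bin/sh
# Queue three commands on the same lock file and print when each gets to run.
mkdir -p /tmp/flockdemo
for i in 1 2 3
do
    flock /tmp/flockdemo/a sh -c "echo waiter $i started at \$(date +%T); sleep 3" &
    sleep 1    # stagger the launches so waiters 2 and 3 queue behind the lock
done
wait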
Wrap your dvc repro command in a script something like this:
#!/bin/sh
set -e
mkdir lock
dvc repro --no-commit
# Each flock call blocks until the last stage holding that lock has finished,
# then removes the lock file.
for item in lock/*
do
    flock "$item" rm "$item"
done
rmdir lock
This script would be your sbatch submission script, but I'm leaving all that out. I'll also leave out the srun part of the minimal working example below, but you'd need it for Slurm in your stage commands.
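For reference, a Slurm-flavoured stage would look something like this (assuming the same --overlap/-n1 flags as in the question):
  one:
    cmd: flock lock/a srun --overlap -n1 ./stamp.sh </dev/null >one.txt &
    outs:
      - one.txt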
When you source job.sh (or sbatch job.sh), the commands all fire into the background and DVC exits. The flock mechanism takes over for releasing commands to run, and the script exits after all locks are released (and cleaned up). You would then run dvc commit.
Here's an example that works without Slurm:
stages:
  one:
    cmd: flock lock/a ./stamp.sh </dev/null >one.txt &
    outs:
      - one.txt
  two:
    cmd: flock lock/a ./stamp.sh <one.txt >two.txt &
    deps:
      - one.txt
    outs:
      - two.txt
  three:
    cmd: flock lock/b ./stamp.sh </dev/null >three.txt &
    outs:
      - three.txt
With executable stamp.sh:
#!/bin/sh
# Print a timestamp, echo back (relabelled) whatever arrives on stdin,
# then linger long enough for the stages to measurably overlap.
echo "time now is $(date +'%T')"
read line
echo "$line" | sed -e "s/now is/then was/"
sleep 10
Some results:
% source job.sh
Running stage 'three':
> flock lock/b ./stamp.sh </dev/null >three.txt &
WARNING: 'three.txt' is empty.
Running stage 'one':
> flock lock/a ./stamp.sh </dev/null >one.txt &
WARNING: 'one.txt' is empty.
Running stage 'two':
> flock lock/a ./stamp.sh <one.txt >two.txt &
WARNING: 'two.txt' is empty.
Updating lock file 'dvc.lock'
To track the changes with git, run:
git add dvc.lock
To enable auto staging, run:
dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
% grep "time" *.txt
one.txt:time now is 11:38:58
three.txt:time now is 11:38:58
two.txt:time now is 11:39:08
two.txt:time then was 11:38:58
Upvotes: 1
Reputation: 6294
Just an idea, and unfortunately I don't know a better solution, but in DVC it could be solved something like this (off the top of my head). It's not a complete solution: you would need to make the wait stage depend on the stages one and two, similar to the dependency on zero, so that wait doesn't start before them:
stages:
  zero:
    cmd:
      - rm -f res* || true
      - echo date > zero
    outs:
      - zero
    always_changed: true
  one:
    deps:
      - zero
    cmd: (./process1.sh; echo $? > res1) &
  two:
    deps:
      - zero
    cmd: (./process2.sh; echo $? > res2) &
  wait:
    deps:
      - zero
    cmd: ./wait.sh
where wait.sh is:
#!/bin/bash
set -eux
while [ ! -f res1 ] || [ ! -f res2 ] ; do sleep 1; done
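Since the stage commands store the exit statuses in res1 and res2 but nothing reads them, one possible extension (my sketch, not required by the idea above) is to have wait.sh propagate a failure:
#!/bin/bash
# Same polling loop, then fail the wait stage if either process reported
# a non-zero exit status in its result file.
set -eux
while [ ! -f res1 ] || [ ! -f res2 ] ; do sleep 1; done
[ "$(cat res1)" -eq 0 ] && [ "$(cat res2)" -eq 0 ]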
It becomes ugly pretty quickly, tbh :( Primarily because there is no mechanism for a stage to depend on another stage without an explicit out/dep between them.
If, in your case, you can make the stages output files in some other way (e.g. create a file as soon as they start), that would simplify the logic a bit.
Upvotes: 0