Reputation: 51
I am running some CFD simulations on a PBS-based cluster. I will run a large number of cases, and therefore want to do the pre-processing on the cluster nodes. I need to do two steps: first meshing, and when the meshing is finished, I want to run the mesh partitioning routine. To avoid manual work I would like to program this in a PBS job script.
I can run the meshing of all cases in parallel by running the following:
#!/usr/bin/env bash
#PBS -q regular
#PBS -l nodes=1:ppn=8
#PBS -N prep_tst_2
#PBS -l walltime=6:00:00
cd $PBS_O_WORKDIR
hexp -batch -project tst_1.igg &
hexp -batch -project tst_2.igg &
hexp -batch -project tst_3.igg &
hexp -batch -project tst_4.igg &
hexp -batch -project tst_5.igg &
hexp -batch -project tst_6.igg &
hexp -batch -project tst_7.igg &
hexp -batch -project tst_8.igg &
#End of script
Where hexp is the meshing program!
I can also run a meshing task followed by the partitioning by running:
hexp -batch -project tst_1.igg ; partit -batch -project tst_1.igg
But how can I combine the two? I want to run 8 instances of the last command in parallel, so that as soon as the meshing of tst_1.igg is finished it continues with the partitioning of tst_1.igg, regardless of the status of the other instances.
Best regards, Adam
Upvotes: 3
Views: 1091
Reputation: 932
It looks like this problem would be handled well by GNU Parallel. If I understand correctly, you want to run hexp followed by partit sequentially for a given file, and you want that sequence to run in parallel for a number of files. I think you would want to use GNU Parallel as follows:
First, create a simple bash script that accepts a filename argument and launches the two commands:
#!/bin/bash
hexp -batch -project "$1" ; partit -batch -project "$1"
#name this file hexpart.sh and make it executable
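One judgment call in hexpart.sh: with ; the partitioning runs even if the meshing failed. If partit only makes sense after a successful hexp, && can be used instead. A minimal sketch of the difference, with echo standing in for the real hexp/partit commands so it can be run anywhere:

```shell
# Variant of the hexpart.sh pattern: '&&' runs the second step only if
# the first one succeeded, while ';' runs it unconditionally.
# 'echo' stands in for hexp/partit here so the sketch is runnable.
mesh_and_partition() {
    echo "hexp -batch -project $1" && echo "partit -batch -project $1"
}
out=$(mesh_and_partition tst_1.igg)
printf '%s\n' "$out"
```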
Next, use GNU Parallel in your PBS script to launch hexpart.sh on multiple CPUs. In this case, eight files on eight CPUs on one node:
#!/bin/bash
#PBS -l nodes=1:ppn=8
#Other PBS directives
cd $PBS_O_WORKDIR
module load gnu-parallel # this will depend on your cluster setup
parallel -j8 --sshloginfile $PBS_NODEFILE --workdir $PBS_O_WORKDIR \
"$(pwd)/hexpart.sh tst_{}.igg" ::: 1 2 3 4 5 6 7 8
#name this file launch.pbs
Then you run qsub launch.pbs, and the parallel command will run hexpart.sh on the eight files, each on a separate CPU. The filenames will be generated by replacing the {} with the arguments after :::. Here is a tutorial for GNU Parallel.
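To see what that substitution does without touching a cluster, the commands generated by the parallel invocation can be previewed with a plain bash loop (hexpart.sh is assumed to live in the working directory):

```shell
# Preview of what the parallel invocation generates: {} is replaced by
# each argument after :::, yielding one hexpart.sh call per case file.
cmds=$(for n in 1 2 3 4 5 6 7 8; do
    echo "./hexpart.sh tst_${n}.igg"
done)
printf '%s\n' "$cmds"
```

GNU Parallel itself can show the same preview with its --dry-run option, which prints the generated commands instead of running them.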
Upvotes: 1
Reputation: 7213
What you are looking for are job dependencies. Let's say that your pre-processing command is placed in a script called preprocess.sh and the partitioning piece that you want to run 8 times is placed in a script called partition.sh:
jobid=$(qsub preprocess.sh)
for ((i=0; i < 8; i++)); do
    qsub -W depend=afterok:$jobid partition.sh
done
This makes the preprocess.sh script a job, and then submits 8 jobs that won't execute unless the first job exits with an exit code of zero. This will work nicely if you have the preprocess script output the results to a network file location that all compute nodes can read and you set up the partition.sh script to read from that same location.
You can read more about job dependencies in the documentation.
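Since the question wants each case to proceed independently (partitioning of tst_1.igg starting as soon as its own meshing is done), the same dependency mechanism can also be applied per case. A hedged sketch, with echo standing in for qsub so the loop runs anywhere, and with hypothetical per-case script names mesh_tst_$i.sh and partit_tst_$i.sh:

```shell
# Per-case dependency chains: one mesh job per case, each partition job
# depending only on its own mesh job. 'echo' stands in for qsub here;
# on a real cluster, use the commented-out qsub calls instead.
submitted=""
for ((i=1; i<=8; i++)); do
    meshid="mesh.$i"                  # meshid=$(qsub mesh_tst_$i.sh)
    submitted+="qsub -W depend=afterok:$meshid partit_tst_$i.sh"$'\n'
done
printf '%s' "$submitted"
```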
Upvotes: 0