Reputation: 1212
I'm currently queuing up a bunch of jobs through qsub in a loop:
for fn in $FNS; do
queue_job $(options_a $fn) $(options_b $fn)
done
queue_job is a script that queues up jobs using qsub, and options_a/options_b are functions I wrote that add a few job options based on the filename. I queue up to 5k jobs this way, and I'd like to add them all to the queue at once (or in larger blocks, such as 40 at a time) instead of one by one in a loop.
I know I can send lines to xargs and execute them in parallel as:
??? | xargs -P 40 -I{} command {}
but I'm not sure how to translate my for loop to xargs.
Upvotes: 3
Views: 1051
Reputation: 33685
Using GNU Parallel it looks like this:
export -f options_a
export -f options_b
parallel -j40 'queue_job $(options_a {}) $(options_b {})' ::: $FNS
Upvotes: 0
Reputation: 70392
xargs is not needed.
If you background the task, the next task can be taken up immediately. You can add some intelligence to your script so that it caps itself on the number of simultaneous tasks. For example:
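COUNT=1
LIMIT=40
for fn in $FNS; do
    queue_job $(options_a "$fn") $(options_b "$fn") &
    # Below the cap: count this task and start the next one right away.
    if [ "$COUNT" -lt "$LIMIT" ]; then
        COUNT=$((COUNT+1))
        continue
    fi
    # At the cap: wait for any one background task to finish.
    wait -n
done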
wait
The queue_job command is placed in the background. The if body continues to spawn parallel queue_job tasks until COUNT reaches LIMIT. Once COUNT has reached LIMIT, the loop waits for one of the running tasks to complete before spawning the next task (wait -n requires bash 4.3 or later). The trailing wait lets the script block until all the tasks have completed.
I tested this by simulating queue_job with a 2 second sleep, 30 tasks, and limiting to 10 parallel tasks. As expected, the simulation completed after about 6 seconds.
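A simulation along those lines could look roughly like the sketch below. It is not the script used for the test above: fake_queue_job, run_capped, and TASK_SECONDS are hypothetical names, and the sleep is scaled down to 0.2 seconds so it finishes quickly (this assumes a sleep that accepts fractional seconds, as the GNU and BSD versions do).

```shell
#!/bin/bash
# Hypothetical stand-in for queue_job: sleeps instead of calling qsub.
TASK_SECONDS=${TASK_SECONDS:-0.2}
fake_queue_job() { sleep "$TASK_SECONDS"; }

launched=0
# Start TOTAL tasks with at most LIMIT in flight (needs bash 4.3+ for wait -n).
run_capped() {
    local total=$1 limit=$2 count=1 i
    for ((i = 0; i < total; i++)); do
        fake_queue_job &
        launched=$((launched + 1))
        # Below the cap: keep spawning without waiting.
        if [ "$count" -lt "$limit" ]; then
            count=$((count + 1))
            continue
        fi
        # At the cap: block until any one task finishes.
        wait -n
    done
    # Drain whatever is still running.
    wait
}

time run_capped 30 10
echo "launched: $launched"
```

Running it with TASK_SECONDS=2 should mirror the 30-task, 10-at-a-time setup described above; the time keyword reports the elapsed wall-clock time on stderr.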
Upvotes: 0
Reputation: 14442
The qsub interface allows submitting only one job at a time; it does not provide bulk submission, which limits the upside of submitting jobs in parallel (job submission is usually fast).
In this specific case, there are two bash functions (options_a and options_b) that expand to job-specific parameters based on the filename. This may limit direct execution with xargs, as suggested in the comments: the functions are defined only in the calling shell, so a command started by xargs is unlikely to find them.
Options:
Create a wrapper for queue_job that sources (or includes) the functions, and call the wrapper from xargs:
xargs -P40 -I{} queue_job_x1 '{}'
queue_job_x1:
#! /bin/bash
function options_a {
...
}
function options_b {
...
}
# xargs passes the filename as the first argument
fn=$1
queue_job $(options_a "$fn") $(options_b "$fn")
It might be a good idea to put the relevant functions into a .sh file, which can be sourced by multiple scripts.
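As a rough sketch of that layout, the script below writes a shared function file and a wrapper into a temporary directory and drives the wrapper from xargs. Everything specific here is an assumption made for illustration: the file name job_options.sh, the bodies of options_a and options_b, the sample filenames, and the echo that stands in for the real queue_job call.

```shell
#!/bin/bash
# Sketch only: job_options.sh, the option bodies, and the echo stub that
# stands in for queue_job are hypothetical placeholders.
workdir=$(mktemp -d)

# Shared function file that any script can source.
cat > "$workdir/job_options.sh" <<'EOF'
options_a() { printf -- '-N job_%s' "${1%.*}"; }
options_b() { printf -- '-o %s.log' "${1%.*}"; }
EOF

# Wrapper in the role of queue_job_x1: source the functions, then submit.
cat > "$workdir/queue_job_x1" <<'EOF'
#!/bin/bash
source "$(dirname "$0")/job_options.sh"
fn=$1
# Replace "echo queue_job" with the real queue_job call.
echo queue_job $(options_a "$fn") $(options_b "$fn")
EOF

FNS="a.dat b.dat c.dat"
# Feed one filename per line to xargs, running up to 40 wrappers at a time.
submitted=$(printf '%s\n' $FNS | xargs -P 40 -I{} bash "$workdir/queue_job_x1" {} | sort)
echo "$submitted"
rm -rf "$workdir"
```

The wrapper is started through bash explicitly so the sketch does not depend on the temporary directory allowing executables; an installed wrapper on the PATH could be called directly, as in the xargs line above.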
Upvotes: 2