Olivier

Reputation: 2111

Running programs in parallel using xargs

I currently have the following script.

#!/bin/bash
# script.sh

for i in {0..99}; do
   script-to-run.sh input/ output/ $i
done

I wish to run it in parallel using xargs. I have tried

script.sh | xargs -P8

But doing the above only executed one at a time. No luck with -n8 either. Adding & at the end of the command inside the for loop would try to run the script 100 times at once. How do I run only 8 iterations of the loop at a time, up to 100 in total?

Upvotes: 134

Views: 161510

Answers (3)

Peter Frost

Reputation: 561

Here's an example of running commands in parallel in conjunction with find:

find -name "*.wav" -print0 | xargs -0 -t -I % -P $(nproc) flac %

-print0 terminates filenames with a null byte rather than a newline so we can use -0 in xargs to prevent filenames with spaces being treated as two separate arguments.

-t means verbose: it makes xargs print every command before executing it. This can be useful; remove it if not needed.

-I % means replace occurrences of % in the command with arguments read from standard input.

-P $(nproc) means run a maximum of nproc instances of our command in parallel (nproc prints the number of available processing units).

flac % is our command; because of the -I % from earlier, it becomes, for example, flac foo.wav
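The effect of -print0 with -0, and of -I % with -P, can be checked without flac installed. The following is only a sketch: the file names are made up, echo stands in for flac, and -P 2 is an arbitrary choice in place of $(nproc).

```shell
# Demo data (hypothetical file names): two names that each contain a space.
tmpdir=$(mktemp -d)
touch "$tmpdir/first take.wav" "$tmpdir/second take.wav"
cd "$tmpdir" || exit 1

# Newline-delimited input: xargs also splits on blanks,
# so each name is broken into two arguments (4 in total).
split_count=$(find . -name '*.wav' | xargs -n 1 echo | wc -l | tr -d ' ')

# Null-delimited input: each name arrives as exactly one argument (2 in total).
safe_count=$(find . -name '*.wav' -print0 | xargs -0 -n 1 echo | wc -l | tr -d ' ')

# -I % substitutes the whole item for %. echo stands in for flac here.
# With -P 2 the completion order is not guaranteed, so the output is sorted.
commands=$(find . -name '*.wav' -print0 | xargs -0 -I % -P 2 echo "would run: flac %" | sort)

echo "split=$split_count safe=$safe_count"
echo "$commands"
```

The first count shows why the null-delimited form matters: without it, a name such as "first take.wav" would reach the command as two separate arguments.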

See also: Manual for xargs(1)

Upvotes: 20

Shubham Gupta

Reputation: 357

You can use this simple one-line command (adjust the seq range to your own inputs; the loop in the question corresponds to seq 0 99):

seq 1 500 | xargs -n 1 -P 8 script-to-run.sh input/ output/
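To see what xargs will actually run before starting real jobs, a dry run with echo in front of the command can help. This is a sketch only: the range is shortened to 0..3 for illustration, and -P is left out so the lines come back in order.

```shell
# Dry run: echo stands in for the real job, so this only prints the
# command lines that xargs builds. Each number is appended as the last argument.
preview=$(seq 0 3 | xargs -n 1 echo script-to-run.sh input/ output/)
echo "$preview"
# script-to-run.sh input/ output/ 0
# script-to-run.sh input/ output/ 1
# script-to-run.sh input/ output/ 2
# script-to-run.sh input/ output/ 3
```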

Upvotes: 10

Etan Reisner

Reputation: 80931

From the xargs man page:

This manual page documents the GNU version of xargs. xargs reads items from the standard input, delimited by blanks (which can be protected with double or single quotes or a backslash) or newlines, and executes the command (default is /bin/echo) one or more times with any initial-arguments followed by items read from standard input. Blank lines on the standard input are ignored.

This means that, in your example, xargs waits and collects all of the output from your script and then runs echo <that output> once. With only one command to run, -P has nothing to parallelize, so this is neither useful nor what you wanted.
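The behavior described in the man page excerpt is easy to observe with a few literal inputs. A small sketch, using made-up words as input:

```shell
# With no command given, xargs runs echo once with every item it collected.
joined=$(printf '%s\n' one two three | xargs)
echo "$joined"       # one two three

# Blank lines in the input are ignored.
no_blanks=$(printf 'one\n\ntwo\n' | xargs)
echo "$no_blanks"    # one two

# Quotes protect a blank, so "two words" stays a single item.
quoted=$(printf '%s\n' '"two words" three' | xargs -n 1 echo)
echo "$quoted"       # prints "two words" and "three" on separate lines
```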

The -n argument sets how many items from the input are passed to each command that gets run (by itself, it says nothing about parallelism).
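A quick way to see the grouping that -n produces, again with echo as a stand-in command and an arbitrary input of five numbers:

```shell
# -n 2 passes at most two items to each echo invocation;
# the last invocation gets whatever is left over.
grouped=$(seq 1 5 | xargs -n 2 echo)
echo "$grouped"
# 1 2
# 3 4
# 5
```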

To do what you want with xargs you would need to do something more like this (untested):

printf %s\\n {0..99} | xargs -n 1 -P 8 script-to-run.sh input/ output/

Which breaks down like this.

  • printf %s\\n {0..99} - Print one number per line, from 0 to 99.
  • Run xargs
    • passing at most one argument to each command it runs (-n 1)
    • and running up to eight processes at a time (-P 8)
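The breakdown above can be tried end to end against a throwaway job. This is a sketch under assumptions: the script written below is a hypothetical stand-in for script-to-run.sh that just records its index, the directories live under a temporary path, and seq 0 99 is used because it emits the same 100 lines as the printf with brace expansion while also working in shells without brace expansion.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/input" "$workdir/output"

# Hypothetical stand-in for script-to-run.sh: writes its index to <output dir>/<index>.out
cat > "$workdir/script-to-run.sh" <<'EOF'
# $1 = input dir (unused here), $2 = output dir, $3 = index appended by xargs
echo "$3" > "$2/$3.out"
EOF

# One index per invocation (-n 1), at most eight invocations running at once (-P 8).
# The script is run through sh so that it does not need to be executable.
seq 0 99 | xargs -n 1 -P 8 sh "$workdir/script-to-run.sh" "$workdir/input/" "$workdir/output/"

# xargs returns only after all jobs have finished, so every output file exists by now.
finished=$(ls "$workdir/output" | wc -l | tr -d ' ')
echo "finished jobs: $finished"
```

Jobs may finish in any order, so anything that depends on ordering should be handled inside the job or after xargs returns.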

Upvotes: 196
