Spunky

Reputation: 41

How to run a fixed number of processes in a loop?

I have a script like this:

#!/bin/bash
for ((i=1; i<=200000; i++))
do
    # create input file
    ./java
done

I need to run a fixed number (8 or 16) of java processes at the same time, and I don't know how. I know that wait could help, but I want 8 processes running at all times, rather than waiting for the first 8 to finish before starting the next 8.

Upvotes: 4

Views: 4247

Answers (4)

Mark Setchell

Reputation: 207345

Use GNU Parallel like this. The example is simplified to 20 jobs rather than 200,000; echo stands in for "create file" and sleep stands in for "java".

seq 1 20 | parallel -j 8 -k 'echo {}; sleep 2'

The -j 8 says how many jobs to run at once. The -k says to keep the output in order.

[Animation of the output, showing the timing and sequence of the jobs]

Upvotes: 2

With a non-ancient version of GNU utilities or on *BSD/OSX, use xargs with the -P option to run processes in parallel.

#!/bin/bash
seq 200000 | xargs -P 8 -n 1 mytask

where mytask is an auxiliary script, with the sequence number (the input line) available as the argument $1:

#!/bin/bash
echo "Task number $1"
# create input file
./java

You can put everything in one script if you want:

#!/bin/bash
seq 200000 | xargs -P 8 -n 1 sh -c '
  echo "Task number $1"
  # create input file
  ./java
' mytask
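
To try the one-script form without Java, here is a small sketch with a placeholder workload: each task writes an input file and then an output file into a temporary directory. The `workdir` variable, the file names and the `sleep` are assumptions made for illustration; only `xargs -P` and `-n` come from the answer above.

```shell
#!/bin/bash
# Placeholder workload: the file names and the sleep are illustrative only.
workdir=$(mktemp -d)
export workdir   # make the directory visible to the sh -c child processes

seq 16 | xargs -P 8 -n 1 sh -c '
  echo "input for task $1" > "$workdir/input.$1"   # stands in for "create input file"
  sleep 1                                           # stands in for "./java"
  echo "task $1 finished" > "$workdir/output.$1"
' mytask

# xargs returns only after every task has exited, so all outputs exist here.
finished=$(ls "$workdir" | grep -c "^output\.")
echo "finished tasks: $finished"
```

With 16 one-second tasks and -P 8, this should take roughly two seconds rather than sixteen, which is an easy way to confirm that the tasks really overlap.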

If your system doesn't have seq, you can use the bash snippet

for ((i=1; i<=200000; i++)); do echo "$i"; done

or other shell tools such as

awk 'BEGIN {for (i=1; i<=200000; i++) print i}' </dev/null

or

</dev/zero tr '\0' '\n' | head -n 200000 | nl -ba

Upvotes: 1

chepner

Reputation: 530853

Set up 8 subprocesses that read from a common stream; each subprocess reads one line of input and starts a new job whenever its current job completes.

forker () {
    while read; do
        # create input file
        ./java
    done
}

cores=8 # or 16, or whatever
for ((i=1; i<=200000; i++)); do
    echo $i
done | {
    for ((j=0; j<cores; j++)); do
        forker <&0 &   # pass the pipe on explicitly; background jobs may otherwise get /dev/null as stdin
    done
    wait   # Wait for the $cores forkers to complete
}

Upvotes: 0

chepner

Reputation: 530853

bash 4.3 added a useful new flag to the wait command, -n, which makes wait block until any single background job completes, rather than waiting for a given set of jobs (or all of them).

#!/bin/bash
cores=8  # or 16, or whatever
for ((i=1; i <= 200000; i++))
do
    # create input file and run java in the background.
    ./java &

    # Check how many background jobs there are, and if it
    # is equal to the number of cores, wait for anyone to
    # finish before continuing.
    background=( $(jobs -p) )
    if (( ${#background[@]} == cores )); then
        wait -n
    fi
done

There is a small race condition: if you are at maximum load but a job completes after you run jobs -p, you'll still block until another job completes. There's not much you can do about this, but it shouldn't present too much trouble in practice.


Prior to bash 4.3, you would need to poll the set of background jobs periodically to see when the pool dropped below your threshold.

while :; do
    background=( $(jobs -p))
    if (( ${#background[@]} < cores )); then
        break
    fi
    sleep 1
done
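
For context, this is roughly how the polling loop slots into the main loop in place of wait -n. It is a sketch with a placeholder job; the `log` file and the small `cores` and `total` values are assumptions for the demonstration, and it uses `jobs -pr` so that only jobs that are still running are counted.

```shell
#!/bin/bash
# Sketch for bash older than 4.3: poll the job list instead of calling "wait -n".
cores=4
total=10
log=$(mktemp)    # hypothetical log file, used only to show that every job ran

for ((i=1; i <= total; i++)); do
    # Placeholder job: sleep stands in for "create input file; ./java".
    { sleep 1; echo "job $i done" >> "$log"; } &

    # Poll until the pool has a free slot, then go round to start the next job.
    while :; do
        background=( $(jobs -pr) )    # -r: count only jobs that are still running
        if (( ${#background[@]} < cores )); then
            break
        fi
        sleep 1
    done
done
wait    # collect the jobs that are still running after the last launch

completed=$(awk 'END { print NR }' "$log")
echo "completed jobs: $completed"
```

The trade-off against wait -n is latency: a free slot can sit idle for up to one polling interval before the next job is started.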

Upvotes: 8
