Gauthier

Reputation: 41945

bash wait exit on error code

Let's consider this as a starting point:

#!/bin/bash
set -e

echo "Sleeping..."
sleep 2 &

wait

echo "Done"
exit 0

I would like wait to exit the whole script if the background process exits with an error. Introducing an error like this:

#!/bin/bash
set -e

echo "Sleeping..."
sleep SOMETHING_STRANGE_AND_WRONG &

wait

echo "Done"
exit 0

still prints "Done". I was expecting wait to exit the script because of set -e.

I know that I can save the pid of sleep and check the return value of the background process this way:

#!/bin/bash
set -e

echo "Sleeping..."
sleep SOMETHING_STRANGE_AND_WRONG &
pid=$!

if wait $pid; then
    echo "Success"
else
    echo "Failure!"
    exit 1
fi

echo "Done"
exit 0

However, this gets cumbersome when I have several such "sync points" within my script, and several subprocesses to wait for at each of these points.

I am not very interested in the error codes themselves, only that they're not success.

Is there a less verbose way to make wait fail and exit (because of set -e) if any of the subprocesses it was waiting for did not succeed?


Edit: I am looking for a solution where wait fails and exits if any of the subprocesses fails:

#!/bin/bash
set -e

echo "Sleeping..."
sleep SOMETHING_STRANGE_AND_WRONG &
sleep 2 &

wait

echo "Done"
exit 0

which I currently solve this way (which I find cumbersome):

#!/bin/bash
set -e

echo "Sleeping..."
pids=""
sleep SOMETHING_STRANGE_AND_WRONG &
pids+=" $!"
sleep 2 &
pids+=" $!"

for p in $pids; do
    if wait $p; then
        echo "Success"
    else
        echo "Failure"
        exit 1
    fi
done

echo "Done"
exit 0

Upvotes: 5

Views: 6008

Answers (3)

rowanthorpe

Reputation: 435

Because your shebang lines specify Bash, I will give a Bash-specific (non-POSIX-portable) answer first, and a less elegant portable version below that.

Bash has a concise and robust way to respond to each child as it finishes, rather than in a hardcoded loop order. The POSIX-portable version has to loop over the PIDs in a fixed order, which is about as good as you can do portably. With small tweaks, either version can handle a failure and exit as soon as it sees the first one, or wait until all children have finished and then handle it. Either way, the parent can optionally wait for all remaining children before it exits, which is useful when not doing so may lead to race conditions or zombies in your larger program.

The pertinent points regarding Bash's non-portable "wait" used in its version are:

  • optflag -f (Bash 5.0 and later): forces "wait" to wait for id to terminate before returning its status, instead of returning when it changes status (you may or may not want this, depending on your use-case)
  • optflag -n (Bash 4.3 and later): waits for a single job from the list of ids or, if no ids are supplied, any job, to complete and returns its exit status
  • exit status 127: is returned if none of the supplied arguments is a child of the shell, or if no arguments are supplied and the shell has no unwaited-for children
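
To make these points concrete, here is a small illustrative sketch (not part of the solution itself) that records what each form of "wait" returns. The variable names are made up for the demonstration, and it leaves out -f so that it only needs Bash 4.3 or later.

```shell
#!/usr/bin/env bash
# Illustration only: compare the exit status of different forms of wait.

# Bare `wait` waits for every child but always returns 0.
false &
wait
status_all=$?

# `wait PID` returns the exit status of that particular child.
(exit 3) &
wait "$!" && status_pid=0 || status_pid=$?

# `wait -n` returns the exit status of the next child to finish.
(exit 5) &
wait -n && status_next=0 || status_next=$?

# With no unwaited-for children left, `wait -n` returns 127.
wait -n && status_none=0 || status_none=$?

echo "wait: $status_all, wait PID: $status_pid, wait -n: $status_next, no children: $status_none"
```

The first case is why the script in the question carries on to "Done": a bare "wait" succeeds regardless of how the children exited, so set -e has nothing to react to.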

Here are the key logic-snippets.

Bash version:

set -e
declare -i err=0 werr=0
while wait -fn || werr=$?; ((werr != 127)); do
  err=$werr
done

POSIX portable shell version:

set -e
werr=0
err=0
for pid in $pids; do
  wait $pid || werr=$?
  ! [ $werr = 127 ] || break
  err=$werr
done

The full scripts below add the failure handling, either straight away or after all children have exited, and optionally wait for the remaining children before the parent exits (see the commented-out lines).

Bash version:

#!/usr/bin/env bash

sleep 2 &
sleep SOMETHING_STRANGE_AND_WRONG &
sleep 1 &

set -e
declare -i err=0 werr=0
while wait -fn || werr=$?; ((werr != 127)); do
  err=$werr
  ## To handle *as soon as* first failure happens uncomment this:
  #((err == 0)) || break
done
## If you want to still wait for children to finish before exiting
## parent (even if you handle the failed child early) uncomment this:
#trap 'wait || :' EXIT
if ((err == 0)); then
  echo "Success"
else
  echo "Failure!"
  exit $err
fi

POSIX portable shell version:

#!/usr/bin/env sh

pids=''
sleep 2 & pids="$pids $!"
sleep SOMETHING_STRANGE_AND_WRONG & pids="$pids $!"
sleep 1 & pids="$pids $!"

set -e
werr=0
err=0
for pid in $pids; do
  wait $pid || werr=$?
  ! [ $werr = 127 ] || break
  err=$werr
  ## To handle *as soon as* first failure happens uncomment this:
  #[ $err = 0 ] || break
done
## If you want to still wait for children to finish before exiting
## parent (even if you handle the failed child early) uncomment this:
#trap 'wait || :' EXIT
if [ $err = 0 ]; then
  echo "Success"
else
  echo "Failure!"
  exit $err
fi
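
For the several "sync points" mentioned in the question, the Bash loop could be wrapped in a function and called once per sync point. This is only a sketch: wait_all is a hypothetical name, and -f is omitted here (it only matters when job control is enabled), so the function needs Bash 4.3 or later. Like the loop above, it cannot tell a child that itself exits with 127 apart from "no children left".

```shell
#!/usr/bin/env bash

# Hypothetical helper: wait for all current children of this shell and
# return non-zero if any of them failed (0 if all succeeded).
wait_all () {
    local -i err=0 werr=0
    # wait -n returns each child's status as it finishes; 127 means
    # there are no unwaited-for children left, which ends the loop.
    while wait -n || werr=$?; ((werr != 127)); do
        err=$werr
    done
    return "$err"
}

# First sync point: one of the two children fails with status 2.
sleep 0.1 &
(exit 2) &
first=0
wait_all || first=$?

# Second sync point: the only child succeeds.
true &
second=0
wait_all || second=$?

echo "first sync point: $first, second sync point: $second"
```

With set -e in effect, a plain `wait_all` call on its own line would then abort the script at the first sync point where a child failed.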

Upvotes: 3

Nahuel Fouilleul

Reputation: 19315

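Shorter would be the following, with the PID saved in pid as in the question (wait without arguments always returns 0, so it needs the PID to report a failure):

wait "$pid" || exit $?

Or, if a message is needed and the failing process has not already logged one:

wait "$pid" || { echo "background failed: $?" >&2; exit 1; }

Or a function could be used instead:

exit_fail() {
    echo "$1" >&2
    exit 1
}

...
wait "$pid" || exit_fail "background failed: $?"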

Upvotes: 2

Gauthier

Reputation: 41945

I implemented the functionality by hand in a function.

# Wait for subprocesses with pids passed as arguments, exit if any failed.
# $1 info_string: to show the command that the failure/success relates to.
# Remaining arguments: list of pids to wait for and check.
wait_and_check () {
    local info_string=$1 p
    shift
    # "$@" passes each pid as its own word, without re-splitting or globbing.
    for p in "$@"; do
        if wait "$p"; then
            echo "$info_string process $p success"
        else
            echo "$info_string process $p Failure!" >&2
            exit 1
        fi
    done
}

Usage:

pids=""
for p in $bunch_of_stuff ; do
    stuff_function $p &
    pids+=" $!"
done

wait_and_check "stuff" $pids

Many people must have needed something similar, so I'm surprised there isn't a ready-made solution for it.

One shortcoming is that the exit codes of the individual processes are lost: the function always exits with 1.
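
The exit codes can be recovered by capturing the status of wait before testing it. Below is a minimal sketch of such a variant. The name wait_and_report is hypothetical, and unlike the function above it returns a status instead of calling exit, so the caller decides what to do with it.

```shell
#!/usr/bin/env bash

# Hypothetical variant of wait_and_check: waits for every pid given after
# the info string, reports each failing exit code on stderr, and returns
# the first non-zero code in argument order (0 if all succeeded).
wait_and_report () {
    local info_string=$1 p rc first_rc=0
    shift
    for p in "$@"; do
        rc=0
        wait "$p" || rc=$?      # "||" keeps set -e from aborting here
        if [ "$rc" -ne 0 ]; then
            echo "$info_string process $p failed with exit code $rc" >&2
            [ "$first_rc" -ne 0 ] || first_rc=$rc
        fi
    done
    return "$first_rc"
}

# Usage sketch: two children, one of which fails with code 4.
pids=()
(exit 0) & pids+=("$!")
(exit 4) & pids+=("$!")
result=0
wait_and_report "demo" "${pids[@]}" || result=$?
echo "overall result: $result"
```

Called without the `|| result=$?` part in a script that uses set -e, a non-zero return would stop the script, which matches the behaviour asked for in the question.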

Upvotes: 1
