HTF

Reputation: 7300

BASH: print output on one line

I've got this simple script below to stream compressed MySQL dumps to an Amazon S3 bucket in parallel:

#!/bin/bash

COMMIT_COUNT=0
COMMIT_LIMIT=2

for i in $(cat list.txt); do

        echo "$i "

        mysqldump -B $i | bzip2 -zc | gof3r put -b s3bucket -k $i.sql.bz2 &


        (( COMMIT_COUNT++ ))

        if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]; then
                COMMIT_COUNT=0
                wait
        fi

done

if [ ${COMMIT_COUNT} -gt 0 ]; then
        wait
fi

The output looks like this:

database1 
database2 
duration: 2.311823213s
duration: 2.317370326s

Is there a way to print this on one line for each dump?

database1 - duration: 2.311823213s
database2 - duration: 2.317370326s

The echo -n switch doesn't help in this case.

EDIT: Wed May 6 15:17:29 BST 2015

I was able to achieve expected results based on accepted answer:

echo "$i -" $(mysqldump -B $i| bzip2 -zc | gof3r put -b s3bucket -k $i.sql.bz2 2>&1) &

However, a command that runs in a subshell does not return its exit status to the parent shell because it runs in parallel, so I'm not able to verify whether it succeeded or failed.

Upvotes: 20

Views: 8511

Answers (7)

hagello

Reputation: 3265

Your script does parallelization by hand. I'd recommend not reinventing the wheel and instead using a tried and tested tool: GNU parallel. The tutorial is extensive: http://www.gnu.org/software/parallel/parallel_tutorial.html

It has different options for jobs that exit with a non-zero status: abort on the first error, or keep working until the end.

One of the advantages of GNU parallel over the OP's script is that it starts the third job as soon as the first one finishes.
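
For example, a minimal sketch of the same pipeline under GNU parallel (an illustration only, assuming GNU parallel is installed; --tag prefixes every output line with the job's argument, which also solves the one-line problem, -j 2 mirrors COMMIT_LIMIT=2, and --halt soon,fail=1 stops launching new jobs after the first failure):

# hypothetical invocation; adjust flags to taste
parallel -j 2 --tag --halt soon,fail=1 \
    'mysqldump -B {} | bzip2 -zc | gof3r put -b s3bucket -k {}.sql.bz2 2>&1' \
    :::: list.txt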

Upvotes: 0

nARN

Reputation: 51

I would write a separate function to control the whole process and then run that function in the background instead of running mysqldump itself.

By doing this you will have several processes running simultaneously, and at the same time you'll have control over mysqldump as if it were run synchronously.

#!/bin/bash

do_job(){
    param=$1
    echo job $param started... >&2  # Output to stderr as stdout is grabbed
    sleep $(( RANDOM / 5000 ))  # standard arithmetic expansion instead of deprecated $[ ]
    echo $RANDOM  # Make some output
    [ $RANDOM -ge 16383 ]  # Generate exit code
}

control_job() {
    param=$1
    output=$(do_job "$param")
    exit_code=$?
    echo "$param printed $output and exited with $exit_code"
}

JOBS_COUNT=0
JOBS_LIMIT=2

for i in database1 database2 database3 database4; do

    control_job $i &

    (( JOBS_COUNT++ ))

    if [ $JOBS_COUNT -ge $JOBS_LIMIT ]; then
        (( JOBS_COUNT-- ))
        wait -n  # wait for any one job to exit (requires bash 4.3+)
    fi

done

wait  # wait for all processes running

Here do_job stands in for your mysqldump pipeline. By the way, there is a small improvement here: you probably do not want to wait for all spawned processes once you've reached the limit. It is enough to wait for any single one, and that is what wait -n does (available in bash 4.3 and later).
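
For reference, a quick interactive demonstration of wait -n (plain bash 4.3+; it reaps whichever background job finishes first and returns that job's status):

$ sleep 2 & sleep 5 &
$ wait -n; echo "first job reaped, status $?"
first job reaped, status 0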

Upvotes: 0

bal

Reputation: 159

Expanding on your answer, to exit the script immediately upon failure, you have to save the pids of the background processes in an array. In your while loop add pids[COMMIT_COUNT]=$! after the mysqldump command.

Then you could write a function to loop over all these pids, and exit if one of them failed:

wait_jobs() {
    for pid in "${pids[@]}"; do
        wait "${pid}"
        status=$?
        if [ ${status} -ne 0 ]; then
            echo "ERROR: Backups failed"
            exit 1
        fi
    done
}

Call this function instead of wait $(jobs -p) in the script.

Notes

You can replace the pids array with jobs -p in the for loop, but then you will not get the pids of jobs that completed before the loop was called.

The wait_jobs() function above cannot be used in a subshell; the exit 1 call would then terminate only the subshell.
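
A quick illustration of that caveat, using nothing beyond plain bash: exit inside a subshell terminates only the subshell, and the parent keeps running.

$ ( exit 1 ); echo "parent still running, subshell status: $?"
parent still running, subshell status: 1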


The complete script:

#!/bin/bash

COMMIT_COUNT=0
COMMIT_LIMIT=2

wait_jobs() {
    for pid in "${pids[@]}"; do
        wait "${pid}"
        status=$?
        if [ ${status} -ne 0 ]; then
            echo "ERROR: Backups failed"
            exit 1
        fi
    done
}

while read -r i; do

    mysqldump -B "$i" | bzip2 -zc | gof3r put -b s3bucket -k "$i".sql.bz2 |& xargs -I{} echo "$i - {}" &
    # save the pid of the background job so we can get the
    # exit status with wait $pid later
    pids[COMMIT_COUNT]=$!

    (( COMMIT_COUNT++ ))

    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]; then
        COMMIT_COUNT=0
        wait_jobs
        # reset the array: waiting again on an already-reaped pid returns 127
        pids=()
    fi

done < list.txt

wait_jobs

Upvotes: 3

HTF

Reputation: 7300

Thanks for all your help, but I think I've finally found an optimal solution for this.

Basically, I used xargs to format the output so that each entry (dump name + duration) is on one line. I also added a job spec to the wait command to get the exit status:

man bash

wait [n ...]
       Wait for each specified process and return its termination status. Each n may be a process ID or a job specification; if a job spec is given, all processes in that job's pipeline are waited for. If n is not given, all currently active child processes are waited for, and the return status is zero. If n specifies a non-existent process or job, the return status is 127. Otherwise, the return status is the exit status of the last process or job waited for.

Test:

# sh -c 'sleep 5; exit 1' &
[1] 29970
# wait; echo $?
0
# sh -c 'sleep 5; exit 1' &
[1] 29972
# wait $(jobs -p); echo $?
1

Final script:

#!/bin/bash

COMMIT_COUNT=0
COMMIT_LIMIT=2


while read -r i; do

    mysqldump -B "$i" | bzip2 -zc | gof3r put -b s3bucket -k "$i".sql.bz2 |& xargs -I{} echo "$i - {}" &

    (( COMMIT_COUNT++ ))

    if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]; then
        COMMIT_COUNT=0
        wait $(jobs -p)
    fi

done < list.txt

if [ ${COMMIT_COUNT} -gt 0 ]; then
     wait $(jobs -p)
fi

if [ $? -ne 0 ]; then
     echo "ERROR: Backups failed"
     exit 1
fi  

Upvotes: 5

tivn

Reputation: 1923

Regarding your additional question about exit status, let me write another answer. Because $() runs a subshell, I don't think it is possible to return the exit status to the main shell the way a normal command would. But it is possible to write the exit status to a file to examine later. Please try the command below. It will create a file called status-$i.txt containing two lines: one for mysqldump, the other for gof3r.

e="status-$i.txt"
echo -n > $e

echo "$i -" $( \
      ( mysqldump -B $i 2>&1; echo m=$? >> $e ) \
    |   bzip2 -zc \
    | ( gof3r put -b s3bucket -k $i.sql.bz2 2>&1; echo g=$? >> $e ) \
) &

You may also need to clean up all status-*.txt files at the start of your script.
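
As a sketch of how the recorded codes might be examined after the jobs finish (this assumes the status-$i.txt format written above; sourcing the file sets the m and g variables from its two lines):

# hypothetical follow-up check, run after wait
. "status-$i.txt"
if [ "${m:-1}" -ne 0 ] || [ "${g:-1}" -ne 0 ]; then
    echo "$i: mysqldump or gof3r failed" >&2
fi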

Upvotes: 1

nanaya

Reputation: 500

untested, etc.

#!/bin/bash

COMMIT_COUNT=0
COMMIT_LIMIT=2

_dump() {
  # better use gzip or xz; there's no benefit to using bzip2, afaict
  output="$(mysqldump -B "$1" | bzip2 -zc | gof3r put -b s3bucket -k "$1.sql.bz2" 2>&1)"
  [ "$?" != 0 ] && output="failed"
  printf "%s - %s\n" "$1" "$output"
}

while read -r i; do
  _dump "$i" &

  (( COMMIT_COUNT++ ))

  if [ ${COMMIT_COUNT} -eq ${COMMIT_LIMIT} ]; then
    COMMIT_COUNT=0
    wait
  fi
done < list.txt

wait

Upvotes: -3

tivn

Reputation: 1923

I think this command will do what you want:

echo "$i -" `(mysqldump -B $i | bzip2 -zc | gof3r put -b s3bucket -k $i.sql.bz2) 2>&1` &

Or, use $() in place of backticks:

echo "$i -" $( (mysqldump -B $i| bzip2 -zc | gof3r put -b s3bucket -k $i.sql.bz2) 2>&1 ) &

The echo command will wait for the mysqldump … pipeline to finish before trying to print its output together with $i. The subshell ( … ) and the error redirection 2>&1 ensure that error messages go into the echoed output too. The space after $( is necessary because $(( without a space is a different special operation: an arithmetic expansion.
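
A quick illustration of the difference:

$ echo $((1 + 2))       # arithmetic expansion
3
$ echo $( (echo hi) )   # command substitution around a subshell
hi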

Upvotes: 7
