senorsmile

Reputation: 760

bash: ensure one line at a time from stdout written to file when backgrounding multiple processes

I have the following script:

loop=0
for did in $(echo "$dids")
do
    echo "$did"
    loop=$((loop+1))
    if [ $loop -lt 10 ] ; then
        ./account_locate_by_phonenumber.sh "$did" 2>/dev/null >>accounts.csv &
    else
        wait
        #make sure to call locate script again or we'll skip this phone number
        ./account_locate_by_phonenumber.sh "$did" 2>/dev/null >>accounts.csv &
        loop=0
    fi
done

It's calling another script account_locate_by_phonenumber.sh which gets results from a remote mysql server via ssh.

If I don't background the calls to that script, there's no problem. The problem appears when backgrounding them. I currently run ten iterations and then call wait, to pause briefly and not completely overwhelm the remote server. Most lines are written fine, but every so often (every 10-50 lines) two lines are written at once and their contents are interleaved.

I assume I need to somehow capture the output and then write it all at once, or per set of iterations, but I'm blanking on how this could be done.

Upvotes: 0

Views: 137

Answers (2)

Barmar

Reputation: 782105

Write each iteration's output to a different file and concatenate them.

loop=0
for did in $(echo "$dids")
do
    echo "$did"
    loop=$((loop+1))
    if [ $loop -lt 10 ] ; then
        ./account_locate_by_phonenumber.sh "$did" 2>/dev/null >accounts.csv.$loop &
    else
        wait
        cat accounts.csv.* >> accounts.csv
        rm accounts.csv.*
        #make sure to call locate script again or we'll skip this phone number
        ./account_locate_by_phonenumber.sh "$did" 2>/dev/null >accounts.csv.0 &
        loop=0
    fi
done
wait
cat accounts.csv.* >> accounts.csv
rm accounts.csv.*

Upvotes: 2

tinkertime

Reputation: 3042

If you're tied to using Bash here... how about:

  • You write your output to 10 separate numbered csv files
  • Replace the time interval wait with a wait for all 10 csv files to be populated
  • Add a combine script after the 10th iteration to push all csv files into your main accounts.csv
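A minimal sketch of those steps in Bash, where fetch_did and the sample $dids values are hypothetical stand-ins for account_locate_by_phonenumber.sh and the real phone-number list:

```shell
#!/usr/bin/env bash
# Sketch only: fetch_did stands in for account_locate_by_phonenumber.sh,
# and this $dids list is made up for illustration.
dids="111 222 333 444 555"

fetch_did() {
    # Placeholder for the ssh/mysql lookup done by the real script.
    echo "result-for-$1"
}

tmpdir=$(mktemp -d)
i=0
for did in $dids; do
    i=$((i+1))
    # Each background job writes to its own numbered file, so jobs
    # can never interleave lines with each other.
    fetch_did "$did" > "$tmpdir/part.$i" &
    if [ $((i % 10)) -eq 0 ]; then
        wait                                  # batch done: files fully written
        cat "$tmpdir"/part.* >> accounts.csv  # combine into the main csv
        rm "$tmpdir"/part.*
    fi
done
wait
cat "$tmpdir"/part.* >> accounts.csv 2>/dev/null  # pick up a partial last batch
rm -rf "$tmpdir"
```

Note that the part.* glob sorts lexically (part.10 before part.2), so if the row order in accounts.csv matters, zero-pad the index (e.g. with printf '%03d').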

Otherwise, I would suggest using a language that has more library support for multiple threads.

Upvotes: 1
