Reputation: 11725
I've never written anything this intense in bash. Basically, I want to run a limited number of data import scripts in parallel. To do so, I need to know when one has terminated in order to start the next. However, I'm not sure how to do this in parallel. The following works synchronously:
# watch the outputfile for "DONE!"
tail -f "$outputfile" | while read -r OUTPUT
do
    if [[ "${OUTPUT}" == *"DONE!"* ]]
    then
        runNextScript
    fi
done
How can I run this asynchronously?
Upvotes: 1
Views: 79
Reputation: 51990
Basically, I want to run a limited number of data import scripts in parallel. To do so, I need to know when one has terminated in order to start the next.
One way of doing that is to create a FIFO containing as many tokens as the maximum number of concurrent scripts.
Then, to launch a task, you first consume a token, run the task, and finally put the token back in the FIFO. That way, once the maximum number of running scripts is reached, the next one blocks until a token becomes available.
Not clear? Here is a proof of concept (you will definitely have to adapt it to your needs!):
#!/bin/bash
# master.sh

rm -f fifo
mkfifo fifo
exec 3<>fifo

# Simulate 26 tasks
tasks=$(exec echo {a..z})

# insert 5 tokens in the fifo
# that is, at most 5 workers working at the same time
for i in {1..5}; do
    (echo T >&3; echo Insert token) &
done

# launch the tasks when a token is available
for i in $tasks; do
    read <&3
    (
        ./worker.sh $i
        echo T >&3
    ) &
done

wait
#!/bin/bash
# worker.sh

# simulate doing some stuff
S=$(( RANDOM % 10 ))
echo "$(exec date +%s) PID$$ doing task $1 for $S"
sleep $S
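As a side note, the same token idea can be packed into a single file, which may be easier to experiment with. This is only a sketch under a few assumptions: the names max_jobs, run_limited, fake_import and results_file are made up for illustration, the FIFO lives in a temporary directory instead of the current one, and fake_import stands in for a real import script.

```bash
#!/bin/bash
# Sketch: token FIFO wrapped in a helper function (all names are hypothetical).

max_jobs=3                                # assumed concurrency limit

workdir=$(mktemp -d)                      # private directory for the FIFO
trap 'rm -rf "$workdir"' EXIT             # clean up when the script exits
mkfifo "$workdir/tokens"
exec 3<>"$workdir/tokens"                 # keep the FIFO open read-write on fd 3

results_file="$workdir/done.txt"
: > "$results_file"

# Preload one token per allowed concurrent job.
i=0
while [ "$i" -lt "$max_jobs" ]; do
    echo T >&3
    i=$((i + 1))
done

# run_limited CMD [ARGS...]: wait for a free token, run CMD in the
# background, and hand the token back once CMD has finished.
run_limited() {
    read -r _token <&3                    # blocks while all tokens are taken
    (
        "$@"
        echo T >&3                        # return the token, even if CMD failed
    ) &
}

# Stand-in for a real import script: sleep briefly, then record the task.
fake_import() {
    sleep 0.2
    echo "$1" >> "$results_file"
}

for task in a b c d e f g h; do
    run_limited fake_import "$task"
done
wait                                      # wait for the remaining background jobs

completed=$(wc -l < "$results_file" | tr -d ' ')
echo "completed $completed tasks"
```

To use it for real work, fake_import would be replaced by the actual import command, for example run_limited ./worker.sh "$task".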
Here is a transcript of a session:
sh$ ./master.sh
Insert token
Insert token
Insert token
Insert token
Insert token
1405456428 PID3039 doing task a for 0
1405456428 PID3041 doing task b for 0
1405456428 PID3046 doing task e for 5
1405456428 PID3043 doing task c for 5
1405456428 PID3045 doing task d for 8
1405456428 PID3055 doing task f for 4
1405456428 PID3057 doing task g for 0
1405456428 PID3066 doing task h for 6
1405456432 PID3070 doing task i for 2
1405456433 PID3074 doing task j for 3
1405456433 PID3077 doing task k for 0
1405456433 PID3082 doing task l for 9
1405456434 PID3086 doing task m for 3
1405456434 PID3089 doing task n for 5
1405456436 PID3094 doing task o for 7
1405456436 PID3097 doing task p for 7
1405456437 PID3102 doing task q for 2
1405456439 PID3106 doing task r for 3
1405456439 PID3109 doing task s for 3
1405456442 PID3114 doing task t for 7
1405456442 PID3118 doing task u for 5
1405456442 PID3121 doing task v for 7
1405456443 PID3126 doing task w for 9
1405456443 PID3129 doing task x for 3
1405456446 PID3134 doing task y for 9
1405456447 PID3138 doing task z for 1
The total execution time is around 20s (closer to 27s if you count until the last worker, task y, finishes its 9s sleep), whereas the total "worked time" of the workers is 113s. If I'm not mistaken, that factor of roughly 4 to 5 corresponds to the 5 workers running in parallel.
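The 113s figure can be double-checked by summing the last field of each transcript line. A small sketch, assuming the transcript lines are pasted into a here-document (the variable name total_worked is just for illustration):

```bash
#!/bin/bash
# Sum the sleep duration (last field) of every "doing task" line.
total_worked=$(awk '/doing task/ { total += $NF } END { print total }' <<'EOF'
1405456428 PID3039 doing task a for 0
1405456428 PID3041 doing task b for 0
1405456428 PID3046 doing task e for 5
1405456428 PID3043 doing task c for 5
1405456428 PID3045 doing task d for 8
1405456428 PID3055 doing task f for 4
1405456428 PID3057 doing task g for 0
1405456428 PID3066 doing task h for 6
1405456432 PID3070 doing task i for 2
1405456433 PID3074 doing task j for 3
1405456433 PID3077 doing task k for 0
1405456433 PID3082 doing task l for 9
1405456434 PID3086 doing task m for 3
1405456434 PID3089 doing task n for 5
1405456436 PID3094 doing task o for 7
1405456436 PID3097 doing task p for 7
1405456437 PID3102 doing task q for 2
1405456439 PID3106 doing task r for 3
1405456439 PID3109 doing task s for 3
1405456442 PID3114 doing task t for 7
1405456442 PID3118 doing task u for 5
1405456442 PID3121 doing task v for 7
1405456443 PID3126 doing task w for 9
1405456443 PID3129 doing task x for 3
1405456446 PID3134 doing task y for 9
1405456447 PID3138 doing task z for 1
EOF
)
echo "total worked time: ${total_worked}s"   # prints: total worked time: 113s
```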
Upvotes: 1