Reputation: 56149
Suppose we have myScript.sh as below:
#!/bin/bash
do something with $1 > bla.txt
do something with bla.txt > temp.txt
...
cat temp.txt >> FinalOutput.txt
Then we run parallel as below:
parallel myScript.sh {} ::: {1..3}
Does it write output in order? Will FinalOutput.txt have the results of 1 first, then 2, and then 3?
Note: I am currently writing to separate files and then merging them in the required order once parallel is complete; I am just wondering whether I can avoid this step.
Upvotes: 2
Views: 100
Reputation: 33685
The ideal way is to avoid temporary files altogether. That can often be done by using pipes:
parallel 'do something {} | do more | something else' ::: * > FinalOutput
But if that is impossible, use temporary files whose names depend on {#}, which is the job sequence number in GNU Parallel:
doer() {
  do something $1 > $2.bla
  do more $2.bla > $2.tmp
  something else $2.tmp
}
export -f doer
parallel doer {} {#} ::: * > FinalOutput
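If the jobs can write their results to stdout, GNU Parallel's -k (--keep-order) option prints each job's output in the order of the input arguments, regardless of which job finishes first. A minimal sketch, assuming GNU Parallel is installed; the sleep/echo command is a hypothetical stand-in for the real pipeline, and the guard exists only so the snippet does not fail where GNU Parallel is missing:

```shell
#!/bin/bash
# Hypothetical stand-in job: input 1 sleeps longest, so it finishes last.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -k buffers each job's stdout and emits it in input order.
    out=$(parallel -k 'sleep $((3 - {})); echo "result {}"' ::: 1 2 3)
else
    out="GNU Parallel not available"
fi
printf '%s\n' "$out"
```

With this approach the redirect to FinalOutput can stay outside parallel, so no per-job merge step is needed.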
Upvotes: 1
Reputation: 311516
The processes are run in parallel. Not only is there no guarantee that they will finish in order, there's not even a guarantee that you can have multiple processes writing to the same file like that and end up with anything useful.
If you are going to be writing to the same file from multiple processes, you should implement some sort of locking to prevent corruption. For example:
while ! mkdir FinalOutput.lock; do
  sleep 1
done
cat temp.txt >> FinalOutput.txt
rmdir FinalOutput.lock
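The locking idea above can be sketched end to end as follows. This is only an illustration under stated assumptions: plain background jobs stand in for GNU Parallel so that it runs without it, and the names append_locked, workdir, and the printf payload are hypothetical.

```shell
#!/bin/bash
# Sketch: serialize appends to a shared file with a mkdir-based lock.
workdir=$(mktemp -d)
final="$workdir/FinalOutput.txt"
lock="$workdir/FinalOutput.lock"

append_locked() {
    local src=$1
    # mkdir either creates the directory or fails, so only one
    # process at a time gets past this loop.
    until mkdir "$lock" 2>/dev/null; do
        sleep 1
    done
    cat "$src" >> "$final"
    rmdir "$lock"
}

for i in 1 2 3; do
    (
        tmp="$workdir/temp-$i.txt"
        printf 'job %s line 1\njob %s line 2\n' "$i" "$i" > "$tmp"
        append_locked "$tmp"
    ) &
done
wait

line_count=$(wc -l < "$final" | tr -d ' ')
echo "lines written: $line_count"
```

The lock keeps each job's lines together, but the jobs still append in completion order, not input order. If a job dies between mkdir and rmdir, the lock directory is left behind and has to be removed by hand.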
If order matters, you should have each script write to a unique file, and then assemble the final output in the correct order after all your parallel jobs have finished.
#!/bin/bash
do something with $1 > bla-$1.txt
do something with bla-$1.txt > temp-$1.txt
...
And then, after parallel has finished:
cat temp-*.txt > FinalOutput.txt
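One caveat with the glob: the shell expands temp-*.txt in lexicographic order, so with ten or more jobs temp-10.txt sorts before temp-2.txt. A minimal sketch of merging in numeric order instead, assuming the inputs are the integers 1..n; the job function and workdir are hypothetical, and background jobs stand in for GNU Parallel:

```shell
#!/bin/bash
# Sketch: each job writes its own file; merge afterwards in numeric order.
workdir=$(mktemp -d)
n=12

job() {
    # Stand-in for the real work; odd inputs sleep so jobs finish out of order.
    sleep "$(( $1 % 2 ))"
    echo "result $1" > "$workdir/temp-$1.txt"
}

for i in $(seq 1 "$n"); do
    job "$i" &
done
wait

# Loop over the inputs explicitly rather than relying on glob order.
for i in $(seq 1 "$n"); do
    cat "$workdir/temp-$i.txt"
done > "$workdir/FinalOutput.txt"

cat "$workdir/FinalOutput.txt"
```

Zero-padding the file names (for example temp-01.txt) would be another way to make the glob order match the numeric order.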
Upvotes: 2