Reputation: 15122
I have a task that combines several commands with a pipe:
cat input/file1.json | jq '.responses[0] | {labelAnnotations: .labelAnnotations}' > output/file1.json
Now there are thousands of input JSON files, and I would like to use GNU Parallel to process them all in parallel. How can I do that? Something like this?
parallel cat {} | jq '...' > output/{./} ::: input/*.json
Note: it gets even more complicated if there is a pipe inside jq's filter...
Upvotes: 6
Views: 3972
Reputation: 33685
https://www.gnu.org/software/parallel/man.html#QUOTING says:
Conclusion: To avoid dealing with the quoting problems it may be easier just to write a small script or a function (remember to export -f the function) and have GNU parallel call that.
In your case it will look like this:
doit() {
  cat "$1" |
    jq '.responses[0] | {labelAnnotations: .labelAnnotations}' > "$2"
}
export -f doit
parallel doit {} output/{/} ::: input/*.json
A nice thing about this is that you can test it:
doit input/foo1.json output/foo1.json
And when that works, parallelizing it is trivial.
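To make the test-then-parallelize workflow concrete, here is a minimal sketch of the whole round trip. It assumes bash, jq and GNU Parallel are installed. The scratch directory and the sample files a.json and b.json are hypothetical and exist only for illustration. The function reads the file with jq directly instead of going through cat, which has the same effect.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash, jq and GNU Parallel; sample data is made up.

# Hypothetical scratch directory with two tiny input files.
work=$(mktemp -d)
mkdir -p "$work/input" "$work/output"
printf '%s\n' '{"responses":[{"labelAnnotations":[{"description":"cat"}]}]}' > "$work/input/a.json"
printf '%s\n' '{"responses":[{"labelAnnotations":[{"description":"dog"}]}]}' > "$work/input/b.json"

# The jq filter lives inside the function, so its quotes and its inner
# pipe never pass through GNU Parallel's own quoting.
doit() {
  jq '.responses[0] | {labelAnnotations: .labelAnnotations}' "$1" > "$2"
}
export -f doit   # make the function visible to the shells parallel starts

if command -v jq >/dev/null 2>&1 &&
   parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  (
    cd "$work" || exit 1
    # 1. Test the function by hand on one file.
    doit input/a.json output/a.json
    # 2. Print the commands GNU Parallel would run, without running them.
    parallel --dry-run doit {} output/{/} ::: input/*.json
    # 3. Run one job per input file.
    parallel doit {} output/{/} ::: input/*.json
  )
else
  echo "jq or GNU Parallel not found; nothing was run" >&2
fi
```

The --dry-run step is a cheap way to check that {} and {/} expand to the paths you expect before any output files are written.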
If you have a newer version of GNU Parallel, this should work, too:
parallel --results output/{/} -q jq '.responses[0] | {labelAnnotations: .labelAnnotations}' ::: input/*.json
Upvotes: 5