RoyS

Reputation: 111

Executing bash script on multiple lines inside multiple files in parallel using GNU parallel

I want to use GNU parallel for the following problem:

I have a few files, each with several lines of text. I would like to run a script (code.sh) on every line of every file, with the files processed in parallel. The output for each input file should be written to an output file with a different extension.

This seems to be a case of nested parallel commands: one running in parallel over all files, and another running in parallel over all lines inside each file.

This is what I used:

ls mydata_* |
    parallel -j+0 'cat {} | parallel -I ./explore-bash.sh > {.}.out'

I am not sure how to do this correctly with GNU parallel. Please help.

Upvotes: 2

Views: 1198

Answers (1)

Ole Tange

Reputation: 33685

Your solution seems reasonable. You just need to remove -I, which expects a replacement string as its argument and therefore swallows ./explore-bash.sh:

ls mydata_* | parallel -j+0 'cat {} | parallel ./explore-bash.sh > {.}.out'

Depending on your setup, the following may be faster: it runs only n jobs at a time, whereas the solution above runs up to n*n jobs in parallel (n = number of cores):

ls mydata_* | parallel -j1 'cat {} | parallel ./explore-bash.sh > {.}.out'

Upvotes: 3
