Tarun Maganti

Reputation: 3076

How to use GNU parallel with find -exec?

I want to unzip multiple files.

Using this answer, I found the following command.

find -name '*.zip' -exec sh -c 'unzip -d "${1%.*}" "$1"' _ {} \;

How do I use GNU Parallel with the above command to unzip multiple files?


Edit 1: In response to questions from user Mark Setchell:

Where are the files?

All the zip files are generally in a single directory.

As I understand it, though, find locates every matching file, recursively or not, depending on the depth given to the find command.

How are the files named?

abcd_sdfa_fasfasd_dasd14.zip

How do you normally unzip a single one?

unzip abcd_sdfa_fasfasd_dasd14.zip -d abcd_sdfa_fasfasd_dasd14

Upvotes: 14

Views: 13612

Answers (2)

YenForYang

Reputation: 3284

You could also use the + variant of -exec. It starts parallel only after find has finished collecting the file names, but it still lets you use -print/-printf/-ls/etc. and gives you the chance to abort find before the command is executed:

find . -type f -name '*.zip' -ls -exec parallel unzip -d {.} ::: {} \+

Note that GNU Parallel also uses {} to specify the input arguments. In this case, however, we use {.} to strip the extension, as in your example. You can override GNU Parallel's replacement string {} with -I (for example, -I@@ lets you use @@ instead of {}).
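To make the comparison with your command concrete, here is a minimal sketch of the `${1%.*}` expansion that {.} stands in for. The directory in the path is hypothetical; the file name is the sample one from the question.

```shell
#!/bin/sh
# Hypothetical path; only the file name comes from the question.
f='./downloads/abcd_sdfa_fasfasd_dasd14.zip'

# ${f%.*} removes the shortest trailing ".suffix" from the value,
# which is the same text GNU Parallel substitutes for {.}.
no_ext=${f%.*}

printf '%s\n' "$no_ext"
```

The directory part of the path is kept, so the target directory ends up next to the archive.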

I recommend using GNU Parallel's --dry-run flag or prepending unzip with an echo to test the command first and see what would be executed.
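A rough sketch of such a test run, assuming GNU Parallel is installed as `parallel`. The two archive names are made up and do not need to exist, because --dry-run only prints the commands.

```shell
#!/bin/sh
# Preview the unzip commands instead of running them.
# a.zip and sub/b.zip are placeholder names, not real files.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -k keeps the printed commands in the same order as the arguments.
    preview=$(parallel -k --dry-run unzip -d {.} {} ::: a.zip sub/b.zip)
else
    preview='GNU Parallel not found; nothing to preview'
fi
printf '%s\n' "$preview"
```

Once the printed commands look right, drop --dry-run to perform the extraction.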

Upvotes: 6

Inian

Reputation: 85570

You can first use find with the -print0 option to NUL-delimit the file names, then read them back in GNU parallel with the matching -0 option and apply unzip:

find . -type f -name '*.zip' -print0 | parallel -0 unzip -d {/.} {}

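The {/.} part is a replacement string that expands to the basename of the file with its last extension removed, i.e. it drops the directory part and the trailing .zip. See the GNU parallel documentation: "7. Get basename, and remove last ({.}) or any ({:}) extension". Because {/.} drops the directory part, the target directories are created in the current working directory; use {.} instead if you want them created next to each archive, as the original ${1%.*} does.

You can also set the number of jobs that run in parallel with the -j flag, e.g. -j8 or -j64.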
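As a sketch only, the pipeline can be tried against throwaway files before pointing it at real archives. Everything below is an assumption for illustration: the temporary directory, the file names, the choice of -j4, and the use of {.} so that each target directory sits beside its archive. --dry-run is kept on, so nothing is extracted.

```shell
#!/bin/sh
# Build a scratch tree with empty placeholder "archives".
tmp=$(mktemp -d)
mkdir -p "$tmp/sub"
: > "$tmp/a.zip"
: > "$tmp/sub/b c.zip"   # name with a space, to exercise -print0 / -0

if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # -0 reads NUL-delimited names, -j4 allows at most four jobs at once,
    # -k prints results in input order, --dry-run only shows the commands.
    plan=$(cd "$tmp" && find . -type f -name '*.zip' -print0 |
           parallel -0 -k -j4 --dry-run unzip -d {.} {})
else
    plan='GNU Parallel not found; nothing to preview'
fi
printf '%s\n' "$plan"

# Clean up the scratch tree.
rm -rf "$tmp"
```

Raising or lowering the -j value changes only how many unzip processes run at the same time, not which files are processed.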

Upvotes: 22
