Reputation: 1941
Is there a way to unzip all the .gz files into the folders that contain them, when the gzipped files are in subdirectories? A query for
find -type f -name "*.gz"
Gives results like this:
./datasets/auto/auto.csv.gz
./datasets/prnn_synth/prnn_synth.csv.gz
./datasets/sleep/sleep.csv.gz
./datasets/mfeat-zernike/mfeat-zernike.csv.gz
./datasets/sonar/sonar.csv.gz
./datasets/wine-quality-white/wine-quality-white.csv.gz
./datasets/ring/ring.csv.gz
./datasets/diabetes/diabetes.csv.g
Upvotes: 8
Views: 13651
Reputation: 3801
If you want, for each of those, to launch "gzip -d" on them:
cd theparentdir && gzip -d $(find ./ -type f -name '*.gz')
and then, to gzip them back:
cd theparentdir && gzip $(find ./ -type f -name '*.csv')
This will however choke in many cases: filenames containing spaces or newlines, or an argument list longer than the shell allows.
A solution, if you have GNU find, would be instead to do:
find ./ -type f -name '*.gz' -print0 | xargs -0 gzip -d # the gunzip step; -print0/-0 use NUL separators, so even spaces and newlines in filenames are handled safely
Another (arguably better?) solution, if you have GNU find at your disposal:
cd theparentdir && find ./ -type f -name '*.gz' -exec gzip -d '{}' '+'
and to re-zip all csv in that parentdir & all subdirs:
cd theparentdir && find ./ -type f -name '*.csv' -exec gzip '{}' '+'
"+" tells find to put as many found files as it can on each gzip invocation (instead of doing one gzip invocation per file, which is very resource-intensive, inefficient, and slow), similar to xargs, but with some benefits (a single command, no pipe needed).
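A minimal sketch of the `find ... -exec gzip -d '{}' '+'` approach above, run in a throwaway directory (the /tmp/gz_demo path and the sample files are just for illustration):

```shell
# Set up a scratch tree with a couple of gzipped CSVs in subdirectories.
demo=/tmp/gz_demo
rm -rf "$demo"
mkdir -p "$demo/datasets/auto" "$demo/datasets/sonar"
printf 'a,b\n' > "$demo/datasets/auto/auto.csv"
printf 'c,d\n' > "$demo/datasets/sonar/sonar.csv"
gzip "$demo/datasets/auto/auto.csv" "$demo/datasets/sonar/sonar.csv"

# A single find run batches all the .gz files onto gzip -d; each file is
# decompressed in place, in its own directory.
(cd "$demo" && find ./ -type f -name '*.gz' -exec gzip -d '{}' '+')

ls "$demo/datasets/auto"   # auto.csv is back; auto.csv.gz is gone
```

The subshell around `cd` keeps the current shell's working directory unchanged.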
Upvotes: 8
Reputation: 1297
gzip has an option for recursion (-r).
gzip -dr ./datasets
All archives will be decompressed in their own directories.
Example: gzip -dr ./a
Before execution:
a/b/c/test1.gz
a/b/d/test2.gz
a/e/test3.gz
After execution:
a/b/c/test1
a/b/d/test2
a/e/test3
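The example above can be reproduced in a scratch directory (the /tmp/gzr_demo path and file contents are arbitrary, for illustration only):

```shell
# Build the a/b/c, a/b/d, a/e tree from the example.
demo=/tmp/gzr_demo
rm -rf "$demo"
mkdir -p "$demo/a/b/c" "$demo/a/b/d" "$demo/a/e"
echo 1 > "$demo/a/b/c/test1"
echo 2 > "$demo/a/b/d/test2"
echo 3 > "$demo/a/e/test3"

gzip -r "$demo/a"    # -r also works for compressing: gzips everything under a/
gzip -dr "$demo/a"   # ...and -dr decompresses it all back, each file in place
```

Note that `-r` works in both directions: with plain `gzip -r` it compresses every file under the directory, and with `gzip -dr` it decompresses them.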
Upvotes: 7