Pierre

Reputation: 133

How to untar a `.tgz` archive and gzip one of the extracted files in memory?

TL;DR
How can I untar a .tgz file and then selectively gzip the output? My extracted directory has a few text files and a .nii file. I'd like to gzip the latter.

More details
The first method would be to just do it sequentially. However, I'm dealing with a huge dataset (10k+ tar archives) stored on a BeeGFS file system, and I was told it would be better to do it in memory instead of in two steps, since BeeGFS doesn't like handling big directories like this.

Sequential method:

for tarfile in "${rootdir}"/*.tgz; do
  tarpath="${tarfile%.tgz}"
  tar zxvf "${tarfile}"       # (1) untar directory
  gzip "${tarpath}"/*.nii     # (2) gzip the .nii file
done

Is there a way to combine (1) and (2)? Or do you have any other tips on how to do this process effectively?
Thanks!

Upvotes: 1

Views: 244

Answers (1)

Shawn

Reputation: 52529

If you know the filename, you can extract a single file from the archive and have tar write it to standard output with -O instead of to a file. You can then compress that stream and redirect it to a file. Something like

tar xzOf "$tarfile" "$tarpath/foo.nii" | gzip -c > "$tarpath/foo.nii.gz"

You can then extract everything else in the archive with tar xzf "$tarfile" --exclude "*.nii"
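
Putting the two commands together, here is a minimal bash sketch of what one pass over an archive might look like when the .nii filename is not known in advance. The function name extract_and_gzip and the outdir argument are made up for illustration, and the sketch assumes the .nii member names contain no newlines. It lists the archive to find the .nii members, creates the target directory before redirecting into it, and then extracts the rest:

```shell
# Hypothetical helper: extract_and_gzip ARCHIVE OUTDIR
# Writes every *.nii member as a gzip-compressed file under OUTDIR and
# extracts all other members unchanged. Assumes member names have no newlines.
extract_and_gzip() {
  local tarfile=$1 outdir=$2 member

  # List the archive and keep only the .nii members.
  tar tzf "$tarfile" | grep '\.nii$' | while IFS= read -r member; do
    # The target directory must exist before the shell can redirect into it.
    mkdir -p "$outdir/$(dirname "$member")"
    # -O sends the member to stdout, so the uncompressed .nii never hits disk.
    tar xzOf "$tarfile" "$member" | gzip -c > "$outdir/$member.gz"
  done

  # Extract everything else, skipping the .nii files.
  tar xzf "$tarfile" -C "$outdir" --exclude='*.nii'
}
```

It could be called in the loop from the question as extract_and_gzip "$tarfile" "$rootdir". Note that this reads each archive three times (list, stream the .nii, extract the rest), so it trades extra reads for never writing the uncompressed .nii file.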

Upvotes: 2
