Vineel

Reputation: 1788

How to read gz compressed files from tar

Let's say we have a tar file that in turn contains multiple gzip-compressed files. I want to read the contents of those gzip files without extracting either the tar file or the individual gzip files. I'm trying to use the tarfile module in Python.

Upvotes: 0

Views: 626

Answers (1)

jmunsch

Reputation: 24139

I haven't tested this, but it shows the main idea and the relevant tools. It iterates over the members of the tar and reads each gzipped file into the file_contents variable:

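import gzip
import tarfile

with tarfile.open("your.gz.tar") as tar:
    for member in tar.getmembers():
        fo = tar.extractfile(member)  # method of the open TarFile; None for directories
        if fo is None:
            continue
        # decompress in memory; nothing is extracted to disk
        file_contents = gzip.GzipFile(fileobj=fo).read()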

Note: if a file is too large to fit in memory, consider a streamed reader that processes it chunk by chunk, as linked.
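A minimal sketch of the chunk-by-chunk idea, using only the standard library. The helper name iter_gz_chunks and the 64 KiB default chunk size are assumptions made for illustration, not something from the linked material:

```python
import gzip
import tarfile


def iter_gz_chunks(tar_path, chunk_size=64 * 1024):
    """Yield (member_name, chunk) pairs of decompressed bytes.

    Hypothetical helper: only chunk_size bytes of decompressed data
    are held in memory at a time for each gzip member.
    """
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, etc.
            fo = tar.extractfile(member)
            with gzip.GzipFile(fileobj=fo) as gz:
                while True:
                    chunk = gz.read(chunk_size)
                    if not chunk:
                        break  # end of this gzip member
                    yield member.name, chunk
```

Each chunk can be processed or written out as it arrives, so a whole member never has to be held in memory at once. This assumes every regular file in the tar is gzip data; a non-gzip member would raise an error on the first read.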

If you have additional logic based on what the member (TarInfo) object looks like you can use these:

see:
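As one illustration of logic based on the member object, here is a rough sketch that uses TarInfo.isfile() and TarInfo.name to pick out only regular files whose names end in .gz. The function name read_gz_members and the suffix check are assumptions; adjust the filter to whatever your archive actually contains:

```python
import gzip
import tarfile


def read_gz_members(tar_path, suffix=".gz"):
    """Return {member name: decompressed bytes} for matching members.

    Hypothetical helper: filters on TarInfo attributes before reading.
    """
    contents = {}
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            # Skip directories/links and anything not named like a gzip file
            if not member.isfile() or not member.name.endswith(suffix):
                continue
            with tar.extractfile(member) as fo:
                contents[member.name] = gzip.decompress(fo.read())
    return contents
```

This version reads each member fully into memory, so it suits small files; member.size could be checked first if some members may be large.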

Upvotes: 1
