ashwin shanker

Reputation: 313

Python read file within tar archive

I have a file, "docs.tar.gz". The tar file contains 4 files, the fourth of which is "docs.json", which is the one I need. I'm able to view the contents of the tar file using:

import tarfile
tar=tarfile.open("docs.tar.gz")
tar.getmembers()

How would I read the fourth file, the JSON file that I need? I'm unable to proceed after listing the contents. Thanks!

Upvotes: 5

Views: 9984

Answers (3)

user1717828

Reputation: 7225

As an example using Python 3's context managers, a JSON file like this:

$ cat myfile.json
{
    "key1": 1,
    "key2": 2,
    "key3": null
}

is compressed with

tar czvf myfile.json.tar.gz myfile.json

and can be read back, without unpacking the archive to disk, like this:

import tarfile
import json

tar_file_name = "myfile.json.tar.gz"
data_file_name = "myfile.json"
with tarfile.open(tar_file_name, "r:gz") as tar:
    with tar.extractfile(data_file_name) as f:
        j = json.loads(f.read())

print(j)
# {'key1': 1, 'key2': 2, 'key3': None}

Upvotes: 0

Stephen Lin

Reputation: 4912

This one will work too.

import tarfile
tar = tarfile.open("docs.tar.gz")
files = tar.getmembers()
f = tar.extractfile(files[0]) # index 0 is the first member; use files[3] if docs.json is the fourth
f.readlines()
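
If you would rather not depend on the member's position, here is a minimal sketch that looks it up by name instead. The helper name read_json_member and the suffix-matching rule are assumptions made for illustration; they are not part of tarfile.

```python
import json
import tarfile


def read_json_member(archive_path, suffix="docs.json"):
    """Parse the first regular file in the archive whose name ends with `suffix`.

    Hypothetical helper: matching on a suffix is an assumption so that
    nested paths such as "some_dir/docs.json" are found as well.
    """
    # "r:*" lets tarfile detect the compression (gz, bz2, xz, or none).
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(suffix):
                # extractfile returns a binary file-like object.
                with tar.extractfile(member) as f:
                    return json.load(f)
    raise FileNotFoundError(f"no member ending in {suffix!r} in {archive_path}")
```

Calling read_json_member("docs.tar.gz") would then return the parsed JSON no matter where docs.json sits in the archive.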

Upvotes: 5

nathancahill

Reputation: 10850

Try this:

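import json
import tarfile

tar = tarfile.open("docs.tar.gz")
# The name must match the member's path inside the archive exactly
# (for example "some_dir/docs.json" if it is nested).
f = tar.extractfile("docs.json")

# f.read() returns the raw bytes.
# Since your file is JSON, you'll probably want to parse it:
data = json.loads(f.read())
tar.close()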
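
For something a bit more defensive, a sketch along the following lines may help. It relies on two tarfile behaviours: extractfile raises KeyError when the name is not in the archive, and returns None when the member is not a regular file or link (a directory, say). The function name load_json_from_tar is made up for this example.

```python
import json
import tarfile


def load_json_from_tar(archive_path, member_name):
    """Return the parsed JSON stored under `member_name` (hypothetical helper)."""
    with tarfile.open(archive_path) as tar:
        try:
            f = tar.extractfile(member_name)
        except KeyError:
            # The name is not present in the archive.
            raise FileNotFoundError(
                f"{member_name!r} not found in {archive_path}"
            ) from None
        if f is None:
            # The member exists but is not a regular file (e.g. a directory).
            raise ValueError(f"{member_name!r} is not a regular file")
        with f:
            return json.load(f)
```

Usage would be load_json_from_tar("docs.tar.gz", "docs.json"); both the archive and the member are closed once the function returns.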

Upvotes: 5
