uday

Reputation: 6713

python compressing multiple data using gzip or zlib

Suppose I have several pieces of data: say, 3 Python arrays (or numpy arrays) and 2 lists of strings.

How can I store each of them in compressed binary format within the same zip file?

I looked at the documentation at https://docs.python.org/3.4/library/gzip.html, but the examples only show how to write a single piece of data: they use gzip.open to open a file and writelines to write that one object out.

I am using Python 3.4.

Upvotes: 1

Views: 1980

Answers (1)

babbageclunk

Reputation: 8731

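To put multiple files in a gzipped archive, use tarfile.open with a mode of w:gz. Then you can use the addfile method to put serialized objects into it, passing an io.BytesIO as the fileobj. Set the size attribute of each TarInfo to the length of the buffer first, since addfile copies exactly that many bytes from the fileobj.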

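import io
import pickle
import tarfile

import numpy

np_array = numpy.zeros(100)
list_of_strs = ['abc'] * 100

# Serialize each object into its own in-memory bytes buffer.
np_array_data = io.BytesIO()
numpy.save(np_array_data, np_array)
str_data = io.BytesIO()
pickle.dump(list_of_strs, str_data)

def add_buffer(tar, name, buf):
    # TarInfo.size defaults to 0, so it must be set or no data is written.
    info = tarfile.TarInfo(name)
    info.size = len(buf.getvalue())
    buf.seek(0)  # rewind so addfile reads from the start
    tar.addfile(info, buf)

with tarfile.open('output.tar.gz', mode='w:gz') as dest_file:
    add_buffer(dest_file, 'np_data', np_array_data)
    add_buffer(dest_file, 'str_data', str_data)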
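
Reading the data back is the mirror image: open the archive with mode r:gz and pull each member out by name with extractfile. Below is a minimal standard-library sketch of that round trip. The helper names write_members, read_member and roundtrip_demo are made up for illustration and are not part of tarfile. For a member written with numpy.save, you would pass the returned bytes to numpy.load wrapped in an io.BytesIO instead of to pickle.loads.

```python
import io
import os
import pickle
import tarfile
import tempfile


def write_members(path, payloads):
    """Write each name -> bytes pair in `payloads` as a member of a gzipped tar.

    Hypothetical helper for illustration only.
    """
    with tarfile.open(path, mode='w:gz') as tar:
        for name, raw in payloads.items():
            info = tarfile.TarInfo(name)
            info.size = len(raw)  # addfile copies exactly info.size bytes
            tar.addfile(info, io.BytesIO(raw))


def read_member(path, name):
    """Return the raw bytes of one member of a gzipped tar archive."""
    with tarfile.open(path, mode='r:gz') as tar:
        member = tar.extractfile(name)  # raises KeyError if `name` is missing
        return member.read()


def roundtrip_demo():
    """Pickle a list of strings into a temporary archive and load it back."""
    strings = ['abc'] * 100
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'output.tar.gz')
        write_members(path, {'str_data': pickle.dumps(strings)})
        return pickle.loads(read_member(path, 'str_data'))
```

Keep in mind that pickle.loads should only be used on archives you created yourself, since unpickling untrusted data can execute arbitrary code.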

If you only wanted to put a number of numpy arrays into a compressed file, you could just use numpy.savez_compressed.
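
As a rough sketch of that route: numpy.savez_compressed takes the arrays as keyword arguments and writes them into one compressed .npz archive, and numpy.load gives them back by name. The wrappers pack_arrays and unpack_arrays are hypothetical names chosen for this example, and an in-memory buffer is used here only to keep it self-contained; a filename works in the same place.

```python
import io

import numpy


def pack_arrays(**arrays):
    """Compress the named arrays into a single in-memory .npz archive."""
    buf = io.BytesIO()
    numpy.savez_compressed(buf, **arrays)
    return buf.getvalue()


def unpack_arrays(raw):
    """Load every array from .npz bytes into a plain dict keyed by name."""
    with numpy.load(io.BytesIO(raw)) as npz:
        return {name: npz[name] for name in npz.files}


# Example: three arrays stored under the names a, b and c.
packed = pack_arrays(a=numpy.zeros(100), b=numpy.arange(5), c=numpy.ones((2, 3)))
arrays = unpack_arrays(packed)
```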

Upvotes: 2
