dzieciou

Reputation: 4514

Reading gzipped package resources

I have created a Python library that you can install from PyPI. It opens a binary file (resource) from within a Python package:

with importlib.resources.open_binary(package_name, 'file.txt') as f:
    ...

I have now decided to compress file.txt to file.txt.gz. If it were a regular file outside a package, I could open it with gzip:

with gzip.open('file.txt.gz', 'rb') as f:

or with smart_open:

with smart_open.open('file.txt.gz', 'rb') as f:

But the file lives inside the library's package. How do I open a .gz file in that case?

Upvotes: 0

Views: 187

Answers (1)

pauljohn32

Reputation: 2265

This worked for me with Python 3.9 and pandas 1.5.1. My file is a gzipped CSV, and the compression parameter tells pandas how to decompress it.

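from importlib import resources
import pandas as pd

filename = 'file.csv.gz'
# Open the packaged resource as a binary stream and let pandas decompress it.
with resources.open_binary("my.package.folder", filename) as fo:
    data_out = pd.read_csv(fo, compression="gzip")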

In your case, with a more general .txt.gz file, a similar approach should work. gzip.open accepts an already open binary file object as well as a filename, so you can hand it the stream returned by open_binary:

import gzip
from importlib import resources

with resources.open_binary("my.package.folder", filename) as fo:
    with gzip.open(fo) as fo2:
        yy = fo2.read()           # decompressed bytes
        zz = yy.decode("utf-8")   # decoded text

I'm fairly sure that zz, the decoded contents of the file, is what you want.
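For what it's worth, here is a minimal self-contained sketch of the same idea that runs without an installed library. It assumes Python 3.9+ and uses resources.files(...) instead of open_binary, which raises a DeprecationWarning on some newer Python versions. The names demo_pkg, read_gzipped_text and demo are made up for the illustration, and the throwaway package is built in a temporary directory only so that the snippet has something to read.

```python
import gzip
import importlib
import sys
import tempfile
from importlib import resources
from pathlib import Path


def read_gzipped_text(package, name, encoding="utf-8"):
    """Return the decompressed text of a gzipped resource inside a package."""
    # files() returns a Traversable; open("rb") gives a binary stream,
    # which gzip.open can wrap directly (mode "rt" decodes to str).
    with resources.files(package).joinpath(name).open("rb") as raw:
        with gzip.open(raw, "rt", encoding=encoding) as text:
            return text.read()


def demo():
    """Build a hypothetical package with one gzipped resource and read it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / "demo_pkg"
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        with gzip.open(pkg_dir / "file.txt.gz", "wt", encoding="utf-8") as f:
            f.write("hello from a gzipped resource\n")

        # Make the temporary package importable for the duration of the demo.
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return read_gzipped_text("demo_pkg", "file.txt.gz")
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_pkg", None)


if __name__ == "__main__":
    print(demo(), end="")
```

In a real library you would only need something like read_gzipped_text, called with your own package name and 'file.txt.gz'. If you want bytes instead of text, drop the "rt" mode and the encoding argument.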

Upvotes: 0
