Reputation: 71
I have a Python application in a directory dir
. This directory has a __main__.py
file and several data files that are read by the application using open(...,'r')
. Without editing the code, it it possible to bundle the code and data files into a single zip file and execute it using something like python app.pyz
python dir
works fine.python -m zipfile -c app.pyz dir/*
, the resulting application will run but cannot read the files. This makes sense.Can I bundle code and data into one file?
Upvotes: 1
Views: 603
Reputation: 91
As of Python 3.9 you can use importlib.resources
from the standard library. This module uses Python's import machinery to resolve the paths of data files as though they were modules inside a package.
Create a new package inside dir
. Let's call it data
. Make sure it has an __init__.py
.
Add your data files to data
. Let's say you added a text file text.txt
and a binary file binary.dat
.
Now from your __main__.py
script or any part of your code with access to the module data
, you can access files inside that package like so:
text.txt
to memory as a string:txt_file = importlib.resources.files("data").joinpath("text.txt").read_text(encoding="utf-8")
binary.dat
to memory as bytes:bin_file = importlib.resources.files("data").joinpath("binary.dat").read_bytes()
path = importlib.resources.files("data").joinpath("text.txt")
with path.open("rt", encoding="utf-8") as file:
lines = file.readlines()
# As streams:
textio_stream = importlib.resources.files("data").joinpath("text.txt").open("rt", encoding="utf-8")
bytesio_stream = importlib.resources.files("data").joinpath("binary.dat").open("rb")
with open()
) without having to modify it:# Old, incompatible with zipfiles.
file_path = "data/text.txt"
with open(file_path, "rt", encoding="utf-8") as file:
lines = file.readlines()
# New, compatible with zipfiles.
file_path = importlib.resources.files("data").joinpath("text.txt")
# If file is inside a zipfile, unzips it in a temporary file, then
# destroys it once the context manager closes. Otherwise, reads the file normally.
with importlib.resources.as_file(file_path) as path:
with open(path, "rt", encoding="utf-8") as file:
lines = file.readlines()
# Since it is a context manager, you can even store it like this:
file_path = importlib.resources.files("data").joinpath("text.txt")
real_path = importlib.resources.as_file(file_path)
with real_path as path:
with open(path, "rt", encoding="utf-8") as file:
lines = file.readlines()
The Traversable
objects returned from importlib.resources
functions can be mixed with Path
objects using as_posix
, since joinpath requires posix separators:
file_path = pathlib.Path("subdirectory", "text.txt")
txt_file = importlib.resources.files("data").joinpath(file_path.as_posix()).read_text(encoding="utf-8")
You can use slashes to grow a Traversable
, just like pathlib.Path
objects:
resources_root = importlib.resources.files("data")
text_path = resources_root / "text.txt"
bin_file = (resources_root / "subdirectory" / "bin.dat").read_bytes()
You can also import the data
package like any other package, and use the module object directly. Subpackages are also supported. The only Python files inside the data
tree are the __init__.py
files of each subpackage:
# __main__.py
import importlib.resources
import data.config
import data.models.b
# Load binary file `file.dat` from `data.models.b`.
# Subpackages are being used as subdirectories.
bin_file = importlib.resources.files(data.models.b).joinpath("file.dat").read_bytes()
...
You technically only need to make your resource root directory be a package. For max brevity:
# __main__.py
from importlib.resources import files
data = files("data") # Resources root.
# In this example, `models` and `b` are regular directories:
bin_file = (data / "models" / "b" / "file.dat").read_bytes()
...
Note that importlib.resources
and zipfiles in general support reading only and you will get an exception if you try to write to any file-like object returned from the above functions. It might technically be possible to support modifying data files inside zips but this is way out of scope. If you want to write files, just open a file in the filesystem as normal.
Now your data files have become file-system agnostic and your program should work via zipapp and normal invocation just the same.
Upvotes: 3