Reputation: 125
I am writing a Python package that has to use external resources. The user can choose to supply their own version of the resources, or simply stick to the default one embedded in the package. I would like to handle the package resources in the same way as the externally supplied resources, which I can access using regular filesystem features. Is there a standard way to do this in Python?
More precisely, the organization of my project is roughly as follows:
package/
├── __init__.py
├── src.py
└── resources
    ├── __init__.py
    └── lib
        ├── dir1
        │   ├── dir1
        │   ├── file1
        │   └── ...
        └── dir2
            ├── file1
            └── ...
The main embedded resource is lib, which is a directory containing an arbitrary number of nested directories and files. The user can invoke a script using either script (which should use package/resources/lib) or script ./path/to/resource (which should use the directory ./path/to/resource).
The issue is that I rely heavily on the directory structure of the resources in order to parse them entirely. In particular, I currently handle the files in a resource directory using pathlib.Path.glob. Although embedded resource files can be read with pkg_resources.resource_stream, for example, I have not found a way to handle resource directories and regular directories in the same way.
Is there an API that allows this? The main feature I am looking for is the ability to list all the files under a directory, whether it is an embedded resource or on the filesystem.
Since packaged resources may be compressed, I think I should use something other than pathlib, something that provides a "Directory" class able to work with regular directories as well as compressed resource directories. Another possibility would be to extract the resources to a regular directory before using them, but that seems to go against the principle of the resource system.
Upvotes: 1
Views: 1642
Reputation: 4519
In Python 3.12+, importlib.resources from Python's own standard library can be used to access whole directories from the package resources via its files and as_file functions:
from importlib import resources

traversable = resources.files("package.resources")
with resources.as_file(traversable) as path:
    for file in path.glob("*"):
        print(file)
Note that the directory may only be accessed within the with block: if it had to be extracted to a temporary location (as is necessary when the package is installed in the form of an archive), that copy is cleaned up once the context manager returned by as_file exits.
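To treat a user-supplied directory and the packaged default uniformly, as the question asks, the as_file call can be wrapped in a small context manager. The following is only a sketch: the helper name resource_dir and its parameters are hypothetical, and the defaults "package.resources" and "lib" are taken from the layout in the question.

```python
from contextlib import contextmanager
from importlib import resources
from pathlib import Path


@contextmanager
def resource_dir(user_path=None, package="package.resources", name="lib"):
    """Yield a real directory as a pathlib.Path (hypothetical helper).

    If user_path is given, it is used as-is; otherwise the directory
    `name` inside `package` is made available through as_file.
    """
    if user_path is not None:
        # User-supplied directory: already on disk, nothing to clean up.
        yield Path(user_path)
    else:
        # Packaged default: as_file may extract to a temporary location,
        # which is removed when this with block exits.
        with resources.as_file(resources.files(package) / name) as path:
            yield path
```

The script could then run the same parsing code in both cases, for example "with resource_dir(arg) as root: ..." followed by root.rglob("*"), where arg is None when no path was passed on the command line.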
This approach should normally be preferred over those involving pkg_resources because, as the latter's own documentation acknowledges, importlib.resources is its official (and more performant) successor. It also avoids the dependency on pkg_resources.
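If extraction is to be avoided altogether, the object returned by resources.files can also be walked directly, since it offers iterdir, is_dir and is_file just like pathlib.Path does. Below is a minimal sketch of a recursive listing that accepts either kind of root; walk_files and resource_root are hypothetical names, and the package layout is assumed to be the one from the question.

```python
from importlib import resources
from pathlib import Path


def walk_files(root, prefix=""):
    """Yield (relative_name, entry) for every file under root, recursively.

    Only iterdir(), is_dir(), is_file() and .name are used, so root can be
    a pathlib.Path or a Traversable obtained from resources.files().
    """
    for entry in sorted(root.iterdir(), key=lambda e: e.name):
        rel = f"{prefix}{entry.name}"
        if entry.is_dir():
            yield from walk_files(entry, prefix=f"{rel}/")
        elif entry.is_file():
            yield rel, entry


def resource_root(user_path=None):
    """Return the user's directory, or the packaged lib directory (assumed layout)."""
    if user_path is not None:
        return Path(user_path)
    return resources.files("package.resources") / "lib"
```

Each yielded entry can be read with entry.read_text() or entry.open(), which both kinds of object provide, so no temporary copy is needed. The trade-off is that glob is not available on packaged resources in general, hence the explicit recursion.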
Upvotes: 0
Reputation: 125
The pkg_resources package allows exactly this. As mentioned in the Resource Extraction section of the documentation, resource_filename(package_or_requirement, resource_name) gives access to a resource as a true filesystem path. In particular, if the resource is compressed, it is extracted to a cache directory and the cached path is returned.
Thus, the files in the resources.lib directory can be listed with, for example:
import pkg_resources
from pathlib import Path

path = pkg_resources.resource_filename("package.resources", "lib")
for file in Path(path).glob("*"):
    print(file)
Upvotes: 1