Leonardo

Reputation: 1901

Manage Python module dependency through a clever import

I'm writing a python module that abstracts the use of three different hardware debugging tools for different CPU architectures.

To illustrate my problem, let's assume that I'm writing a configuration database that abstracts the use of XML, YAML, and JSON files to store stuff:

import xml.etree.ElementTree as ET
import json
import yaml

class abstract_data:
    def __init__(self, filename):
        '''
        Loads a file regardless of the type

        Args:
            filename (str): The file that has the data
        '''
        if filename.endswith('.xml'):
            self.data = ET.parse(filename)
        else:
            with open(filename) as f:
                if filename.endswith('.json'):
                    self.data = json.load(f)
                elif filename.endswith('.yaml'):
                    self.data = yaml.load(f, Loader=yaml.FullLoader)

    def do_something(self):
        print(self.data)

def main():
    d = abstract_data("test.yaml")
    d.do_something()

if __name__ == "__main__":
    # execute only if run as a script
    main()

However, I know for a fact that 99% of my users will only use JSON files, and that setting up the other two libraries isn't very easy.

However, if I just put the imports at the top of my source code as PEP 8 recommends, that creates a dependency on all three libraries for every user, and I'd like to avoid that.

My (probably bad) solution is to use conditional imports, like so:

import json  # JSON is the common case, so it can stay at module level

class abstract_data:
    def __init__(self, filename):
        '''
        Loads a file regardless of the type

        Args:
            filename (str): The file that has the data
        '''
        if filename.endswith('.xml'):
            import xml.etree.ElementTree as ET
            self.data = ET.parse(filename)
        else:
            with open(filename) as f:
                if filename.endswith('.json'):
                    self.data = json.load(f)
                elif filename.endswith('.yaml'):
                    import yaml
                    self.data = yaml.load(f, Loader=yaml.FullLoader)

While this seems to work in a simple module, is this the best way to handle this problem? Are there any side effects?

Please note that I'm using XML, JSON, and YAML as an illustrative case of three different imports.

Thank you very much!

Upvotes: 2

Views: 89

Answers (2)

syntonym

Reputation: 7384

Short answer: look at the entrypoints library to discover loaders and at setuptools entry points to register them.

One way is to use a separate file per "loading method":

file json_loader.py:

import json

def load(filename):
    with open(filename) as f:
        return json.load(f)

file xml_loader.py:

import xml.etree.ElementTree as ET

def load(filename):
    return ET.parse(filename)

But to know whether a given loader is supported, you have to try to import it and catch any ImportError:

import os
# other imports

loaders = {}

try:
    from json_loader import load as json_load
    loaders["json"] = json_load
except ImportError:
    print("json not supported")

...

file_ext = os.path.splitext(file_name)[1].lstrip(".")  # splitext keeps the leading dot
self.data = loaders[file_ext](file_name)

You could also move the registering code into the modules themselves, so that you only need to catch ImportError in the main script:

file loaders.py:

loaders = {}

file xml_loader.py:

import xml.etree.ElementTree as ET
import loaders

def load(filename):
    return ET.parse(filename)

loaders.loaders["xml"] = load

file main.py:

try:
    import xml_loader
except ImportError:
    pass

There is also the entrypoints library, which does something similar to the solution above but integrated with setuptools:

import entrypoints

loaders = {name: ep.load() for name, ep in entrypoints.get_group_named("my_database_configurator.loaders").items()}

...

file_ext = os.path.splitext(file_name)[1].lstrip(".")  # splitext keeps the leading dot
self.data = loaders[file_ext](file_name)

To register the entry points you need a setup.py for each loader (see e.g. here). You can of course combine this with abstract classes and optional dependencies. One advantage is that you don't need to change anything to support more extensions; installing a plugin is enough to register it. Having multiple implementations for one extension is also possible: the user just installs the one they want and it gets picked up automatically.
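As a rough sketch of what such a setup.py could look like (the package and module names here are made up for illustration), the plugin registers its loader under the group queried above:

file setup.py of a hypothetical YAML plugin:

from setuptools import setup

setup(
    name="my_database_configurator_yaml",
    version="0.1",
    py_modules=["yaml_loader"],
    install_requires=["PyYAML"],
    entry_points={
        # the group name must match the one passed to entrypoints.get_group_named()
        "my_database_configurator.loaders": [
            "yaml = yaml_loader:load",
        ],
    },
)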

Upvotes: 1

Brad Solomon

Reputation: 40918

If you want to keep your class's implementation as it is now, a common pattern is to set a flag at the top of the module based on whether the import succeeds:

try:
    import yaml
except ImportError:
    HAS_YAML = False
else:
    HAS_YAML = True

class UnsupportedFiletypeError(Exception):
    pass

Especially for a larger project, it can be useful to put this in a single module, attempt the import only once, and then use that flag everywhere else. For example, put the above in _deps.py and use from ._deps import HAS_YAML.
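A minimal sketch of that layout, assuming a package named config_db (the name is only illustrative):

# config_db/_deps.py - attempt the optional import exactly once
try:
    import yaml  # noqa: F401
except ImportError:
    HAS_YAML = False
else:
    HAS_YAML = True

Any other module in the package can then do from ._deps import HAS_YAML instead of repeating the try/except.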

Then later:

# ...
elif filename.endswith('.yaml'):
    if not HAS_YAML:
        raise UnsupportedFiletypeError("You must install PyYAML for YAML file support")
    self.data = yaml.load(f, Loader=yaml.FullLoader)

Secondly, if this is an installable Python package, consider using extras_require.

That would let the user do something like:

pip install pkgname[yaml]

If pkgname[yaml] is specified rather than just pkgname, PyYAML gets installed as an additional dependency.
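A minimal setup.py sketch of that (pkgname is just a placeholder):

from setuptools import setup

setup(
    name="pkgname",
    version="0.1",
    packages=["pkgname"],
    install_requires=[],            # the JSON path only needs the standard library
    extras_require={
        "yaml": ["PyYAML"],         # pulled in only for: pip install pkgname[yaml]
    },
)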

Upvotes: 3
