jbndlr

Reputation: 5210

Conditionally override python module on import: Design issues

In my setup, requesting one module of a subpackage automatically loads all other modules from that package. I am looking for a way to avoid this.

Disclaimer

A few months ago, I asked a question which is related. However, since this issue is new and different, I created this as a new question.

For those who want to read up on the previous problem and my attempt at it, see Mask a python submodule from its package's __init__.py

The Setup

In one of my packages, I have a subpackage called config holding a bunch of config files for other submodules and subpackages:

mypackage
 |
 +-- subpkg_a
 |    |
 |    +-- __init__.py
 |    +-- <some modules here>.py
 |
 +-- config
 |    |
 .    +-- __init__.py
 .    +-- subpkg_a_sample.py
      +-- .gitignore (ignores everything except __init__ or *_sample.py)

The Requirement

Since the above setup resides in a repository, colleagues can (and shall) clone it in order to use mypackage on their local systems. However, their configuration may be different, which is why I want to provide the ability to override the sample config given in *_sample.py and have them provide their own configuration file only locally.

The rationale is that I want people to contribute code where they can use config settings directly like this (in order to keep the code as general as possible):

from config import subpkg_a as conf_a
print(conf_a.MY_CONFIG_SETTING)

However, if they have to make local adaptations to the configuration files to fit certain settings to their local setup, they should not have to change *_sample.py, as it is part of the repository and contains general-purpose settings and examples. And if they did change the *_sample.py config file, there are always users who accidentally check in their local changes (especially deletions) and thus need to be protected from themselves...

Thus, I need a way to override *_sample.py with a local copy if one is present, and otherwise to load *_sample.py when that config is imported.

The Solution I Came Up With

Currently, I am using the following code inside config/__init__.py:

import os
import sys
import imp
import re

# Extend the __all__ list for all sub-packages that provide <pkg>_sample.py config files
__all__ = []
_cfgbase = os.path.dirname(os.path.realpath(__file__))
_r = re.compile(r'^(?P<key>.+?)_sample\.py$')
for f in os.listdir(_cfgbase):
    m = _r.match(f)
    if m: __all__.append(m.group('key'))

# Load local override file, if any. Otherwise load respective *_sample.py
for cfgmodule in __all__:
    if os.path.isfile(os.path.join(_cfgbase, cfgmodule + '.py')):
        locals()[cfgmodule] = imp.load_source('mypackage.config.' + cfgmodule, os.path.join(_cfgbase, cfgmodule + '.py'))
    else:
        locals()[cfgmodule] = imp.load_source('mypackage.config.' + cfgmodule, os.path.join(_cfgbase, cfgmodule + '_sample.py'))

What this code does:

  1. Scan the directory of this __init__.py for files that end in _sample.py and store their name (i.e. everything before _sample.py) in the package's __all__ list. These are the packages for which there are config files available that can be loaded.

  2. Iterate over the __all__ list and freshly import the local override, if there is any, or import the respective *_sample.py otherwise. No matter which file was imported, it is made accessible under the name without the _sample.py extension via the config module. I have done this to keep the code that relies on the config module as clean and generic as possible.
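For readers on newer interpreters: the imp module was deprecated in Python 3.4 and removed in Python 3.12, so the two steps above can also be sketched with importlib. This is only a rough equivalent, not the code I am running. The helper names find_config_keys and load_config are made up for illustration, and it still loads a module eagerly when called, so it does not by itself solve the problem described below.

```python
import importlib.util
import os
import re
import sys

_SAMPLE_RE = re.compile(r'^(?P<key>.+?)_sample\.py$')


def find_config_keys(cfgbase):
    """Step 1: return every <key> that has a <key>_sample.py in cfgbase."""
    keys = []
    for fname in sorted(os.listdir(cfgbase)):
        m = _SAMPLE_RE.match(fname)
        if m:
            keys.append(m.group('key'))
    return keys


def load_config(cfgbase, key, package='mypackage.config'):
    """Step 2: load <key>.py if it exists, otherwise <key>_sample.py."""
    override = os.path.join(cfgbase, key + '.py')
    sample = os.path.join(cfgbase, key + '_sample.py')
    path = override if os.path.isfile(override) else sample

    modname = package + '.' + key
    spec = importlib.util.spec_from_file_location(modname, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[modname] = module  # register the module under its dotted name
    spec.loader.exec_module(module)  # actually run the config file
    return module
```

Inside config/__init__.py this would be used roughly as `__all__ = find_config_keys(_cfgbase)` followed by a loop that assigns `load_config(_cfgbase, key)` to each name.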

Finally, the Problem

The issue I am now facing is that even when a user imports the configuration for only a single subpackage, i.e. from config import subpkg_a as conf_a, all other configurations available in that directory are immediately loaded as well.

This behavior follows directly from the conditional import in my __init__.py (see above). But some configuration files rely on other imports (e.g. celery or mpl_toolkits.basemap) that can take significant effort to install in certain environments. As a result, any user who needs just a fraction of the configs and subpackages still has to install every package and module imported by any configuration file, even if the user's own code never touches that config.

I feel like having established a poor design, but I do not know how to do better. So I ask you:

Do you see any possibility to change the config's __init__.py (or even the entire setup) such that it does not load all config files when a user requests a single one?

I am grateful for hints and answers of any kind. Cheers!

Upvotes: 0

Views: 457

Answers (1)

Martijn Pieters

Reputation: 1121524

Just make local configuration explicit.

Have people that want to override configuration create a local_config.py file. You can add that to .gitignore. Have users import that module in their own code.

At the top of the local_config.py module, teach your users to import the desired sample config:

from config.subpkg_a_sample import *

Note the import * here. Now all names from subpkg_a_sample are imported into local_config. The user can easily override anything they need to. Then in the rest of the software, use

from warnings import warn

try:
    import local_config as config
except ImportError:
    warn('No local configuration available')
    import config.default_config as config

or similar approaches to get a default configuration in place.
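One such similar approach, as a minimal sketch: wrap the fallback in a small helper so that the rest of the code asks for the configuration in one place. The function name get_config and its parameters are hypothetical. Unlike the plain ImportError version, this variant only falls back when the local module itself is missing; if local_config exists but one of its own imports fails, the error is re-raised instead of being silently hidden.

```python
import importlib
import warnings


def get_config(local_name='local_config',
               default_name='config.default_config'):
    """Return the local config module if present, else the default one."""
    try:
        return importlib.import_module(local_name)
    except ModuleNotFoundError as exc:
        # exc.name is the module that could not be found. Fall back only
        # when that is the local config itself, not one of its dependencies.
        if exc.name != local_name:
            raise
        warnings.warn('No local configuration available')
        return importlib.import_module(default_name)
```

Callers would then write `config = get_config()` and read settings as attributes of the returned module.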

Another approach is to add:

try:
    from local_config import *
except ImportError:
    pass

to all your *_sample.py modules, making those modules responsible for applying local configuration. That way names are also overridden by local configuration.
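To see why this works, here is a small self-contained demonstration. It writes a throwaway sample module and, optionally, a local_config module into a temporary directory and imports the sample. The file contents and the demo function are made up for illustration. Note that the try block sits at the end of the sample module: names bound later replace earlier ones, so the star import has to come after the defaults for the local values to win.

```python
import importlib
import os
import sys
import tempfile

SAMPLE_SRC = """\
MY_CONFIG_SETTING = 'sample value'
OTHER_SETTING = 42

# Placed last so that local names replace the defaults above.
try:
    from local_config import *
except ImportError:
    pass
"""

LOCAL_SRC = "MY_CONFIG_SETTING = 'local value'\n"


def demo(with_local=True):
    """Import the sample config, with or without a local override present."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, 'subpkg_a_sample.py'), 'w') as fh:
            fh.write(SAMPLE_SRC)
        if with_local:
            with open(os.path.join(tmp, 'local_config.py'), 'w') as fh:
                fh.write(LOCAL_SRC)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            conf = importlib.import_module('subpkg_a_sample')
            return conf.MY_CONFIG_SETTING, conf.OTHER_SETTING
        finally:
            # Leave no trace so the demo can be run repeatedly.
            sys.path.remove(tmp)
            sys.modules.pop('subpkg_a_sample', None)
            sys.modules.pop('local_config', None)


print(demo(with_local=True))   # ('local value', 42)
print(demo(with_local=False))  # ('sample value', 42)
```

Only the names defined in local_config are replaced; everything else keeps its sample value.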

Requiring users to create a local_config.py is no more strenuous than what you already have, where you ask users to pick a sample config and provide overrides for it, which you then magically interpolate.

Upvotes: 2
