Reputation: 663
I'm trying to support multiple versions of a python package without impacting client code.
Consider the following repo:
.
|-- client_code.py
`-- lib
    |-- __init__.py
    |-- foo.py
    `-- bar/baz.so
client_code.py:
from lib.foo import f
from lib.bar.baz import g
...
f()
g()
I'd like to leave client_code.py unchanged but also have access to both versions of the library. Ideally, I would like something like this:
lib
|-- __init__.py
|-- v1
|   |-- __init__.py
|   |-- foo.py
|   `-- bar/baz.so
`-- v2
    |-- __init__.py
    |-- foo.py
    `-- bar/baz.so
lib/__init__.py:
import os
if os.environ.get("USE_V2", "0") == "0":  # Or some other runtime check
    from .v1 import *
else:
    from .v2 import *
However, the client code fails with the following error:
Traceback (most recent call last):
  File "client_code.py", line 1, in <module>
    from lib.foo import f
ImportError: No module named foo
Note that the problem doesn't have to do with __all__, since the following also fails with the same exception:
if os.environ.get("USE_V2", "0") == "0":
    from .v1 import foo
else:
    from .v2 import foo
I feel like something like this has to be possible, but I'm having a hard time finding the right keywords to search for, so I'm asking here.
The reason for requiring this (as opposed to just having different runtime environments) is that I would like the library version being used to be user-specified at runtime (e.g., use a specific .so compiled for a given GPU architecture). I could create separate Docker images for all of the permutations, but that would be excessively cumbersome.
A more restricted version of this question was previously asked here (Support two versions of a python package without clients needing to change code). The accepted solution, however, requires a separate "mirror" library with one file for each module in lib, whose sole purpose is to redirect to v1 or v2 based on the runtime variable.
Is it possible to have a single redirection point for all submodules nested under lib?
Any help would be much appreciated. Thanks ahead of time!
Upvotes: 2
Views: 1428
Reputation: 4368
When your main script does import lib.foo, the Python import system looks in turn through the directories on the import path (sys.path, which includes the entries from PYTHONPATH). In each one, it searches for a lib package (a directory containing an __init__.py file) which itself contains a foo module or package.
But your lib package does not contain a foo module; it contains v1.foo and v2.foo.
A first solution:
from lib import foo
foo.hello()
It works because we do not try to import a lib.foo package; instead we use the foo object defined in the lib module (which happens to be a module too, but as far as we are concerned it looks like any other Python object). And importing the lib package works fine.
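The difference between the two import forms can be sketched with a throwaway package written to a temporary directory. The package name lib_demo and its contents are made up for this demonstration; only the v1 branch is reproduced.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package: lib_demo/__init__.py re-exports v1.foo as an attribute.
root = tempfile.mkdtemp()
files = {
    "lib_demo/__init__.py": "from .v1 import foo\n",
    "lib_demo/v1/__init__.py": "",
    "lib_demo/v1/foo.py": "def hello():\n    return 'hello from v1'\n",
}
for relative_path, source in files.items():
    path = os.path.join(root, relative_path)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(source)
sys.path.insert(0, root)

# Attribute lookup on the already imported package: works.
from lib_demo import foo
print(foo.hello())
print(foo.__name__)  # the module still knows its real location

# Submodule import: the finder looks for lib_demo/foo on disk and fails.
try:
    importlib.import_module("lib_demo.foo")
    submodule_import_failed = False
except ModuleNotFoundError:
    submodule_import_failed = True
print(submodule_import_failed)
```

On Python 3 the failure is reported as ModuleNotFoundError, a subclass of ImportError.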
But I understand that requiring your library to be used exactly this way is not very user-friendly and prone to error.
If we want import lib.foo to work as you want, we have to cheat a bit with the import system:
# main.py
import lib.foo
lib.foo.hello()
# lib/__init__.py
import os
import sys
if os.environ.get("USE_V2", "0") == "0":
    from .v1 import foo
else:
    from .v2 import foo
sys.modules["lib.foo"] = foo # <---- cheating here
Here we manipulate sys.modules, which is the cache for loaded modules.
The first time your Python program imports a certain package, the Python runtime searches for the corresponding file on disk, compiles it to bytecode, constructs the module object, stores it in the cache and then binds the module object to the name you want in your context.
For example:
import math # search, read, compile, create module object, store to cache, bind to "math"
print(math.pi) # the name "math" is now defined
import math as math2 # hit the cache, bind the module object to "math2"
print(math2.pi) # the name "math2" is now defined
print(math is math2) # True
So we can modify the cache ourselves: that's what the line sys.modules["lib.foo"] = foo does. It tells the Python runtime that, for future imports, if asked for lib.foo, it should return the foo module.
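To see that the import statement really consults this cache before touching the disk, here is a minimal sketch that registers a hand-built module object under a name with no file behind it. The names handmade and cached_only_demo are invented for the illustration.

```python
import sys
import types

# Build a module object by hand; nothing is read from disk.
handmade = types.ModuleType("handmade")
handmade.hello = lambda: "hello from the cache"

# Register it under a name that has no corresponding file.
sys.modules["cached_only_demo"] = handmade

# The import statement finds the cache entry and simply binds it.
import cached_only_demo

print(cached_only_demo is handmade)
print(cached_only_demo.hello())
```

The import succeeds even though no cached_only_demo.py exists anywhere, which is the same mechanism the lib.foo alias relies on.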
This also makes one of your two packages importable from two qualified names:
import lib.foo
import lib.v1.foo
import lib.v2.foo
print(lib.foo is lib.v1.foo) # True when USE_V2=0
print(lib.foo is lib.v2.foo) # True when USE_V2=1
But it has side effects that you may not like:
- Tools that resolve imports statically (for example IDEs and type checkers) will probably not find the lib.foo package. There may be ways to solve this problem, I don't know (maybe providing a type hints file?).
- The sys.modules documentation warns:
This [dictionary] can be manipulated to force reloading of modules and other tricks. However, replacing the dictionary will not necessarily work as expected and deleting essential items from the dictionary may cause Python to fail.
There are many footguns with manipulating this dictionary.
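Since the question asks for a single redirection point covering every submodule, the same sys.modules aliasing can in principle be done in a loop. What follows is only a sketch under several assumptions, not a tested solution for the real library: it builds a throwaway lib tree in a temporary directory, uses baz.py in place of baz.so, and gives bar an __init__.py (as far as I know, pkgutil.walk_packages skips directories without one). The LIB_INIT string is the hypothetical content of lib/__init__.py.

```python
import os
import sys
import tempfile
import textwrap

# Hypothetical lib/__init__.py: one redirection point for every submodule.
LIB_INIT = textwrap.dedent('''\
    import importlib
    import os
    import pkgutil
    import sys

    _version = "v1" if os.environ.get("USE_V2", "0") == "0" else "v2"
    _impl = importlib.import_module(__name__ + "." + _version)

    # Import every module of the selected version and register it a second
    # time in sys.modules under the unversioned name (lib.v1.foo -> lib.foo).
    for _info in pkgutil.walk_packages(_impl.__path__, _impl.__name__ + "."):
        _module = importlib.import_module(_info.name)
        _relative = _info.name[len(_impl.__name__) + 1:]
        sys.modules[__name__ + "." + _relative] = _module
        if "." not in _relative:
            # Also bind top-level names so "import lib.foo; lib.foo.f()" works.
            globals()[_relative] = _module
''')


def build_demo_tree(root):
    """Create lib/{v1,v2}/foo.py and lib/{v1,v2}/bar/baz.py under root."""
    files = {"lib/__init__.py": LIB_INIT}
    for version in ("v1", "v2"):
        files[f"lib/{version}/__init__.py"] = ""
        files[f"lib/{version}/foo.py"] = f"def f():\n    return 'f from {version}'\n"
        files[f"lib/{version}/bar/__init__.py"] = ""
        files[f"lib/{version}/bar/baz.py"] = f"def g():\n    return 'g from {version}'\n"
    for relative_path, source in files.items():
        path = os.path.join(root, relative_path)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write(source)


demo_root = tempfile.mkdtemp()
build_demo_tree(demo_root)
sys.path.insert(0, demo_root)
os.environ["USE_V2"] = "1"  # select v2 for this run

# Unchanged client code:
from lib.foo import f
from lib.bar.baz import g

print(f())
print(g())
```

Note that this imports every module of the selected version eagerly when lib is first imported, which may be undesirable for large libraries or heavy extension modules, and the caveats listed above about manipulating sys.modules apply to every alias it creates.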
Upvotes: 2