Reputation: 12603
To support extensions in my Python project, I'm trying to create a pseudo-module that will serve "extension modules" as its submodules. I'm having a problem treating the submodules as modules: it seems I need to access them using from..import on the main pseudo-module, and can't just import them by their full path.
Here is a minimal working example:
import sys
from types import ModuleType


class Foo(ModuleType):
    @property
    def bar(self):
        # Here I would actually find the location of `bar.py` and load it
        bar = ModuleType('foo.bar')
        sys.modules['foo.bar'] = bar
        return bar


sys.modules['foo'] = Foo('foo')

from foo import bar # without this line the next line fails
import foo.bar
This works, but if I comment out the from foo import bar line, it'll fail with:
ImportError: No module named bar
on Python2, and on Python3 it'll fail with:
ModuleNotFoundError: No module named 'foo.bar'; 'foo' is not a package
If I add the fields to make it a package:
class Foo(ModuleType):
    __all__ = ('bar',)
    __package__ = 'foo'
    __path__ = []
    __file__ = __file__
It'll fail on:
ModuleNotFoundError: No module named 'foo.bar'
From what I understand, the problem is that I have not set sys.modules['foo.bar'] yet. But to fill sys.modules I need to load the module first, and I don't want to do that unless the user of my project explicitly imports it.
Is there any way to make Python realize that when it sees import foo.bar it needs to load foo first (or I can just guarantee foo will already be loaded at that point) and take bar from it?
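The behaviour described above can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical package name demo_pkg and a hypothetical Tracking class that records every attribute lookup that falls through to __getattr__. It suggests that a dotted import asks the finders on sys.meta_path for the submodule and does not fall back to attribute access on the parent:

```python
import sys
from types import ModuleType

lookups = []  # names that reached __getattr__ on the parent


class Tracking(ModuleType):
    """Hypothetical parent module that records failed attribute lookups."""

    def __getattr__(self, name):
        lookups.append(name)
        raise AttributeError(name)


parent = Tracking("demo_pkg")  # "demo_pkg" is an illustrative name only
parent.__path__ = []           # an empty __path__ is enough to count as a package
sys.modules["demo_pkg"] = parent

try:
    import demo_pkg.child
except ModuleNotFoundError as exc:
    error = exc

# The import failed without ever asking the parent for "child".
print(error)
print("child" in lookups)
```

So a property or __getattr__ on the parent is not enough by itself; something on sys.meta_path has to claim the name foo.bar.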
Upvotes: 1
Views: 632
Reputation: 21453
This post does NOT answer "This is how you do it." If you want to know how to do this yourself look at PEP 302 or Idan Arye's solution. This post instead presents a recipe that makes it easy to write. The recipe is at the end of this answer.
The block of code below defines two classes intended for use: PseudoModule and PseudoPackage. They differ only in whether import foo.x should raise an error stating that foo isn't a package, or try to load x and make sure it's a module. Several example uses are outlined below.
PseudoModule can be used as a decorator on a function. It creates a new module object that, when an attribute is accessed for the first time, calls the decorated function with the name of the attribute and the namespace of previously defined elements.
For example, this will make a module that assigns a new integer to each attribute accessed:
@PseudoModule
def access_tracker(attr, namespace):
    namespace["_count"] = namespace.get("_count", -1) + 1
    return namespace["_count"]
    #PseudoModule will set `namespace[attr] = <return value>` for you
    #this can be overridden by passing `remember_results=False` to the constructor

sys.modules["access_tracker"] = access_tracker

from access_tracker import zero, one, two, three
assert zero == 0 and one == 1 and two == 2 and three == 3
PseudoPackage is used the same way as PseudoModule. However, if the decorated function returns a module (or package), it will correct the name to be qualified as a subpackage, and sys.modules is updated as needed. (The top-level package still needs to be added to sys.modules manually.)
Here is an example use of PseudoPackage:
spam_submodules = {"bacon"}
spam_attributes = {"eggs", "ham"}

@PseudoPackage
def spam(name, namespace):
    print("getting a component of spam:", name)
    if name in spam_submodules:
        @PseudoModule
        def submodule(attr, nested_namespace):
            print("getting a component of submodule {}: {}".format(name, attr))
            return attr #use the string of the attribute
        return submodule #PseudoPackage will rename the module to be spam.bacon for us
    elif name in spam_attributes:
        return "supported attribute"
    else:
        raise AttributeError("spam doesn't have any {!r}.".format(name))

sys.modules["spam"] = spam

import spam.bacon
#prints "getting a component of spam: bacon"
assert spam.bacon.something == "something"
#prints "getting a component of submodule bacon: something"

from spam import eggs
#prints "getting a component of spam: eggs"
assert eggs == "supported attribute"

import spam.ham #ham isn't a submodule, raises error!
The way PseudoPackage is set up also makes packages of arbitrary depth very easy, although this specific example doesn't accomplish much:
def make_abstract_package(qualname = ""):
    "makes a PseudoPackage that has arbitrary nesting of subpackages"
    def gen_func(attr, namespace):
        print("getting {!r} from package {!r}".format(attr, qualname))
        return make_abstract_package("{}.{}".format(qualname, attr))
    #can pass the name of the module as second argument if needed
    return PseudoPackage(gen_func, qualname)

sys.modules["foo"] = make_abstract_package("foo")

from foo.bar.baz import thing_I_want
##prints:
# getting 'bar' from package 'foo'
# getting 'baz' from package 'foo.bar'
# getting 'thing_I_want' from package 'foo.bar.baz'
print(thing_I_want)
#prints "<module 'foo.bar.baz.thing_I_want' from '<PseudoPackage>'>"
As general guidelines:
- [...] sys.modules yourself.
- PseudoPackage assumes each submodule is unique; don't reuse module objects.

It is also worth noting that sys.modules is only updated with submodules of a PseudoPackage when an import statement requires the name to be a module. For example, if foo is a package already in sys.modules but foo.x has not been referenced yet, then all these assertions will pass:
assert "foo.x" not in sys.modules and not hasattr(foo,"x")
import foo; foo.x #foo.x is computed but not added to sys.modules
assert "foo.x" not in sys.modules and hasattr(foo,"x")
from foo import x #x is retrieved from namespace but sys.modules is still not affected
assert "foo.x" not in sys.modules
import foo.x #if x is a module then "foo.x" is added to sys.modules
assert "foo.x" in sys.modules
Also, in the above case, if foo.x isn't a module then the statement import foo.x raises a ModuleNotFoundError.
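The two error cases mentioned in this answer (parent is not a package, versus parent is a package but the name is not a module) can be seen with the standard library alone. This is only an illustration using sys and json, not part of the recipe:

```python
# Both statements fail with ModuleNotFoundError, but for different reasons.
try:
    import sys.path  # sys is a plain module: it has no __path__
except ModuleNotFoundError as exc:
    not_a_package = str(exc)

try:
    import json.dumps  # json is a package, but dumps is a function
except ModuleNotFoundError as exc:
    not_a_module = str(exc)

print(not_a_package)
print(not_a_module)
```

On Python 3.6 and later, the first message ends with "'sys' is not a package", while the second only reports that no module named 'json.dumps' exists. PseudoModule aims to reproduce the first case and PseudoPackage the second.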
Finally, while the problematic edge cases I have identified can be avoided by following the guidelines above, the docstring for _PseudoPackageLoader describes the implementation details responsible for the unwanted behaviour, in case of future modifications.
import sys
from types import ModuleType
import importlib.abc #uses Loader and MetaPathFinder, more for inspection purposes than use

class RawPseudoModule(ModuleType):
    """
    see PseudoModule for documentation, this class is not intended for direct use.

    RawPseudoModule does not handle __path__ so the generating function of direct
    instances are expected to make and return an appropriate value for __path__

    *** if you do not know what an appropriate value for __path__ is
    then use PseudoModule instead ***
    """
    #using slots keeps these two variables out of the module dictionary
    __slots__ = ["__generating_func", "__remember_results"]
    def __init__(self, func, name=None, remember_results = True):
        name = name or func.__name__
        super(RawPseudoModule, self).__init__(name)
        self.__file__ = "<{0.__class__.__name__}>".format(self)
        self.__generating_func = func
        self.__remember_results = remember_results

    def __getattr__(self, attr):
        value = self.__generating_func(attr, vars(self))
        if self.__remember_results:
            setattr(self, attr, value)
        return value

class PseudoModule(RawPseudoModule):
    """
    A module that has attributes generated from a specified function

    The generating function passed to the constructor should have the signature:
        f(attr:str, namespace:dict) -> object:
         - attr is the name of the attribute accessed
         - namespace is the currently defined values in the module

    the function should return a value for the attribute or raise an AttributeError if it doesn't exist.

    by default the result is then saved to the namespace so you don't
    have to explicitly do "namespace[attr] = <value>" however this behaviour
    can be overridden by specifying "remember_results = False" in the constructor.

    If no name is specified in the constructor the function name will be
    used for the module name instead, this allows the class to be used as a decorator

    Note: the PseudoModule class is set up so that "import foo.bar"
    when foo is a PseudoModule will fail stating "'foo' is not a package".
     - to allow importing submodules use PseudoPackage.
     - to handle the internal __path__ manually use RawPseudoModule.

    Note: the module is NOT added to sys.modules automatically.
    """
    def __getattr__(self, attr):
        #to not have submodules then __path__ must not exist
        if attr == "__path__":
            msg = "{0.__name__} is a PseudoModule, it is not a package so it doesn't have a __path__"
            #this error message would only be seen by people who explicitly access __path__
            raise AttributeError(msg.format(self))
        return super(PseudoModule, self).__getattr__(attr)

class PseudoPackage(RawPseudoModule):
    """
    A version of PseudoModule that sets itself up to allow importing subpackages

    When a submodule is imported from a PseudoPackage:
     - it is evaluated with the generating function.
     - the name of the submodule is overridden to be correctly qualified
     - and it is added to sys.modules to allow repeated imports.

    Note: the top level package still needs to be added to sys.modules manually
    Note: A RecursionError will be raised if the code that generates submodules
          attempts to import another submodule from the PseudoPackage.
    """
    #IMPLEMENTATION DETAIL: technically this doesn't deal with adding submodules to
    #  sys.modules, that is handled in _PseudoPackageLoader
    #  which explicitly checks for instances of PseudoPackage

    __path__ = [] #packages must have a __path__ to be recognized as packages.

    def __getattr__(self, attr):
        value = super(PseudoPackage, self).__getattr__(attr)
        if isinstance(value, ModuleType):
            #I'm just going to say if it's a module then the name must be in this format.
            value.__name__ = self.__name__ + "." + attr
        return value

class _PseudoPackageLoader(importlib.abc.Loader, importlib.abc.MetaPathFinder):
    """
    Singleton finder and loader for pseudo packages

    Whenever a subpackage of a PseudoPackage (that is already in sys.modules) is imported
    this will handle loading it and adding the subpackage to sys.modules

    Note that although PEP 302 states the finder should not depend on the parent
    being loaded in sys.modules, this is implemented under the understanding that
    the user of PseudoPackage will add their module to sys.modules manually themselves
    so this will work only when the parent is present in sys.modules

    Also PEP 302 indicates the module should be added to sys.modules first in case
    it is imported during its execution, however this is impossible due to the
    nature of how the module actually gets loaded.
    So for heaven's sake don't try to import a pseudo package or a module that uses
    a pseudo package from within the code that generates it.

    I have only tested this when the sub module is either PseudoModule or PseudoPackage
    and it was created new from the generating function, ideally there would be a way
    to allow the generating function to return an unexecuted module and this would
    properly handle executing it but I don't know how to deal with that.
    """
    def find_module(self, fullname, path):
        #this will only support loading if the parent package is a PseudoPackage
        base,_,_ = fullname.rpartition(".")
        if isinstance(sys.modules.get(base), PseudoPackage):
            return self
        #I found that `if path is PseudoPackage.__path__` worked the same way for all the cases I tested
        #however since load_module will fail if the base part isn't in sys.modules
        #  it seems safer to just check for that.

    def load_module(self, fullname):
        if fullname in sys.modules:
            return sys.modules[fullname]
        base,_,sub = fullname.rpartition(".")
        parent = sys.modules[base]
        try:
            submodule = getattr(parent, sub)
        except AttributeError:
            #when we just access `foo.x` it raises an AttributeError
            #but `import foo.x` should instead raise an ImportError
            raise ImportError("cannot import name {!r}".format(sub))
        if not isinstance(submodule, ModuleType):
            #match the format of error raised when the submodule isn't a module
            #example: `import sys.path` raises the same format of error.
            raise ModuleNotFoundError("No module named {}".format(fullname))
        #fill all the fields as described in PEP 302 except __name__
        submodule.__loader__ = self
        submodule.__package__ = base
        submodule.__file__ = getattr(submodule, "__file__", "<submodule of PseudoPackage>")
        #if there was a way to do this before the module was made that'd be nice
        sys.modules[fullname] = submodule
        #if we needed to execute the body of an unloaded module it'd be done here.
        return submodule

#add the loader to sys.meta_path so it will handle our pseudo packages
sys.meta_path.append(_PseudoPackageLoader())
Upvotes: 3
Reputation: 12603
Thanks to the link @TadhgMcDonald-Jensen provided I managed to solve it:
import sys
from types import ModuleType


class FooImporter(object):
    module = ModuleType('foo')
    module.__path__ = [module.__name__]

    def find_module(self, fullname, path):
        if fullname == self.module.__name__:
            return self
        if path == [self.module.__name__]:
            return self

    def load_module(self, fullname):
        if fullname == self.module.__name__:
            return sys.modules.setdefault(fullname, self.module)
        assert fullname.startswith(self.module.__name__ + '.')
        try:
            return sys.modules[fullname]
        except KeyError:
            submodule = ModuleType(fullname)
            name = fullname[len(self.module.__name__) + 1:]
            setattr(self.module, name, submodule)
            sys.modules[fullname] = submodule
            return submodule


sys.meta_path.append(FooImporter())

from foo import bar
@TadhgMcDonald-Jensen - please make an answer so that I can approve it.
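A note for newer interpreters: find_module and load_module are the legacy PEP 302 hooks, and Python 3.12 removed support for find_module on meta path finders. The same idea can be sketched with the PEP 451 protocol (find_spec, create_module, exec_module). This is only a sketch under assumptions: the package name lazyfoo, the class LazyFooFinder and the loaded_by attribute are hypothetical and chosen for illustration.

```python
import sys
from importlib.abc import Loader, MetaPathFinder
from importlib.machinery import ModuleSpec

ROOT = "lazyfoo"  # hypothetical top-level name, not an existing package


class LazyFooFinder(MetaPathFinder, Loader):
    """Serve ROOT and any direct ROOT.<name> submodule on demand."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname == ROOT:
            # is_package=True gives the module an (empty) __path__
            return ModuleSpec(fullname, self, is_package=True)
        parent, _, child = fullname.rpartition(".")
        if parent == ROOT and child:
            return ModuleSpec(fullname, self)
        return None  # let other finders handle everything else

    def create_module(self, spec):
        return None  # None means "use the default module creation"

    def exec_module(self, module):
        # A real extension system would locate and execute bar.py here.
        module.loaded_by = type(self).__name__


sys.meta_path.append(LazyFooFinder())

import lazyfoo.bar  # nothing is created until this statement runs
```

With this approach the import system itself inserts the modules into sys.modules and sets the submodule as an attribute on its parent, so the loader does not have to do either by hand.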
Upvotes: 1