Reputation: 107
I have a Python module directory structure like this:
my_module
|--__init__.py
|--public_interface
| |--__init__.py
| |--my_sub_module
| | |--__init__.py
| | |--code.py
| |--some_more_code.py
|--other directories omitted
Now, the public_interface
directory (among several others) is only there to organize the code into logical sub-units, as a guideline for me and other developers. The eventual user of my_module
shall only see it as my_module.my_sub_module
without the public_interface
in-between.
I wrote these __init__.py
files:
my_module.__init__.py
:from .public_interface import *
and
my_module.public_interface.__init__.py
:from . import my_sub_module
from .some_more_code import *
and
my_module.public_interface.my_sub_module.__init__.py
:from .code import *
This works fine as long as the user imports only the top-level module:
import my_module
my_module.my_sub_module.whatever # Works as intended
However, this does not work:
from my_module import my_sub_module
nor:
import my_module.my_sub_module
What would I have to change to make these last two imports work?
Upvotes: 2
Views: 115
Reputation: 155604
The import system only allows actual packages and modules to be imported directly as part of the dotted module name, but your:
from .public_interface import *
hack just makes my_sub_module
an attribute of the my_module
package, not an actual submodule for the purposes of the import system. It breaks for the same reason doing:
from collections._sys import *
breaks; yes, as an implementation detail, the collections
package happens to import sys
aliased to _sys
, but that doesn't actually make _sys
a submodule of collections
, it's just one of many attributes on the collections
package. From the import machinery's point of view, my_sub_module
is no more a submodule of my_module
than _sys
is of collections
; the fact that it is nested in a sub-directory under my_module
is irrelevant.
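The difference between "attribute of a package" and "importable submodule" can be checked directly. This is a minimal sketch that relies on the same CPython implementation detail mentioned above (collections importing sys as _sys), so the attribute check may differ on other interpreters or versions; the variable names are only for illustration.

```python
import collections
import importlib

# Attribute lookup: current CPython's collections does "import sys as _sys"
# (an implementation detail), so the name is usually visible on the package.
has_attribute = hasattr(collections, "_sys")

# Submodule lookup: the import system searches collections.__path__ for a
# module or package named "_sys", finds none, and raises.
try:
    importlib.import_module("collections._sys")
    submodule_importable = True
except ModuleNotFoundError as exc:
    submodule_importable = False
    error_message = str(exc)

print("attribute present:", has_attribute)
print("importable as submodule:", submodule_importable)
```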
That said, the import system provides a hook that lets you treat additional arbitrary directories as part of a package: the __path__
attribute. By default, __path__
just includes the path to the package itself (so my_module
's __path__
defaults to ['/absolute/path/to/my_module']
), but you can programmatically manipulate it however you want; when resolving submodules, the import system searches only the final contents of __path__
, much like importing top level modules searches sys.path
. So to resolve your particular case (wanting all packages/modules in public_interface
to be importable without specifying public_interface
in the import line), just change your my_module/__init__.py
file to have the following contents:
import os.path
__path__.append(os.path.join(os.path.dirname(__file__), 'public_interface'))
All that does is tell the import system that, when import my_module.XXXX
occurs (XXXX
is a placeholder for a real name), if it can't find my_module/XXXX
or my_module/XXXX.py
, it should look for my_module/public_interface/XXXX
or my_module/public_interface/XXXX.py
. If you want it to search public_interface
first, change it to:
__path__.insert(0, os.path.join(os.path.dirname(__file__), 'public_interface'))
or to have it only check public_interface
(so nothing directly under my_module
is importable at all), use:
__path__[:] = [os.path.join(os.path.dirname(__file__), 'public_interface')]
to replace the contents of __path__
entirely.
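To try the append variant without touching a real project, the layout from the question can be recreated in a temporary directory. This is only a sketch: the helper build_demo_package, the contents of code.py, and the empty public_interface/__init__.py are assumptions made for the demo, not part of the original code.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Write a throwaway copy of the question's layout under root."""
    pkg = root / "my_module"
    sub = pkg / "public_interface" / "my_sub_module"
    sub.mkdir(parents=True)
    # The fix from the answer: extend the package's submodule search path.
    (pkg / "__init__.py").write_text(
        "import os.path\n"
        "__path__.append(os.path.join(os.path.dirname(__file__), 'public_interface'))\n"
    )
    (pkg / "public_interface" / "__init__.py").write_text("")
    (sub / "__init__.py").write_text("from .code import *\n")
    # Hypothetical content for code.py, just so there is something to call.
    (sub / "code.py").write_text("def whatever():\n    return 'hello from code.py'\n")


demo_root = Path(tempfile.mkdtemp())
build_demo_package(demo_root)
sys.path.insert(0, str(demo_root))
importlib.invalidate_caches()

# Both import forms from the question now resolve through __path__.
import my_module.my_sub_module
from my_module import my_sub_module

result = my_sub_module.whatever()
print(result)
print(my_sub_module.__name__)
```

One thing to keep in mind with this approach: the package is registered under the name my_module.my_sub_module. If other code also imports it as my_module.public_interface.my_sub_module, Python loads it a second time as a separate module object, so it is safest to stick to one spelling.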
Side-note: You might wonder why os.path
is an exception to this rule; on CPython, os
is a plain module with an attribute path
(which happens to be the module posixpath
or ntpath
depending on platform), yet you can do import os.path
. This works because the os
module, while being imported, explicitly (and hackily) populates the sys.modules
cache for os.path
. This isn't normal, and it has a performance cost; import os
must always import os.path
implicitly, even if nothing from os.path
is ever used. __path__
avoids that problem; nothing is imported unless requested.
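The os.path special case is easy to observe from the interpreter. A small sketch, assuming CPython (other implementations may organize os differently):

```python
import os
import sys

# A real package has a __path__ attribute; on CPython, os is a plain module.
os_is_package = hasattr(os, "__path__")

# "import os" alone has already registered os.path in the module cache.
cached_is_same_object = sys.modules["os.path"] is os.path

# The real module behind os.path depends on the platform.
real_name = os.path.__name__

print("os is a package:", os_is_package)
print("sys.modules['os.path'] is os.path:", cached_is_same_object)
print("os.path is really:", real_name)
```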
You could achieve the same result by making my_module/__init__.py
contain:
import sys
from .public_interface import my_sub_module
sys.modules['my_module.my_sub_module'] = my_sub_module
which would allow people to use my_module.my_sub_module
having only done import my_module
, but that would force any import
of my_module
to import public_interface
and my_sub_module
, even if nothing from my_sub_module
is ever used. os.path
continues to do it for historical reasons (using os.path
APIs with only import os
worked a long time ago, and a lot of code relies on that misbehavior because programmers are lazy and it worked), but new code shouldn't use this hack.
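The eager-import cost of the sys.modules variant can be made visible with a throwaway package. Again a sketch under assumptions: the package is built in a temporary directory, and the public_interface and my_sub_module __init__.py files are left empty for brevity.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
sub = root / "my_module" / "public_interface" / "my_sub_module"
sub.mkdir(parents=True)
# The sys.modules hack from the answer, written into the package's __init__.py.
(root / "my_module" / "__init__.py").write_text(
    "import sys\n"
    "from .public_interface import my_sub_module\n"
    "sys.modules['my_module.my_sub_module'] = my_sub_module\n"
)
(root / "my_module" / "public_interface" / "__init__.py").write_text("")
(sub / "__init__.py").write_text("")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import my_module  # only the top-level package is requested

# Everything below was loaded as a side effect of that single import.
loaded = sorted(name for name in sys.modules if name.startswith("my_module"))
print(loaded)
```

With the __path__ approach, the same check right after import my_module would list only my_module itself.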
Upvotes: 1