So, I was learning about Python modules, and according to my understanding, when we try to import a module in our code, Python checks whether the module can be found in one of the locations listed in sys.path, and if it cannot, a ModuleNotFoundError is raised.
So, suppose I want to import from a location that is not in sys.path by default. I can simply append this new location to sys.path, and everything works fine, as shown in the snippet below.
~/Documents/python-modules/usemymodule.py
import sys
sys.path.append("/home/som/Documents/modules")
import mymodule
mymodule.yell("Hello World")
~/Documents/python-modules/modules/mymodule.py
def yell(txt):
    print(txt.upper())
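For anyone who wants to try this without touching their own directories, here is a minimal self-contained sketch of the same idea. The module name mymodule_demo and the temporary directory are made up for the demo; they are not part of the setup above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module, written to a throwaway directory for this demo only.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "mymodule_demo.py").write_text(
    "def yell(txt):\n    return txt.upper()\n"
)

# The directory is not on sys.path yet, so the import should fail.
try:
    import mymodule_demo
    found_before = True
except ModuleNotFoundError:
    found_before = False

# Add the directory to the search path and retry.
sys.path.append(str(module_dir))
importlib.invalidate_caches()  # make sure the finders notice the new file

import mymodule_demo

result = mymodule_demo.yell("Hello World")
print(found_before, result)
```

The first import fails and the second succeeds, which matches the behaviour described above.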
My doubt is this: if I clear the entire sys.path list, I should not be able to import any modules, but to my surprise I can still import built-in modules. The code below works fine.
import sys
sys.path.clear()
import math
math.ceil(10.2)
I thought it might be that Python doesn't use sys.path internally and that sys.path is just a shallow copy of the list Python actually uses. But then how does appending to sys.path work, and why is it that after clearing it I can still import built-in modules but not custom modules?
I am really stuck, so any help would be appreciated. There is a similar question to this one, but it doesn't answer my doubts.
Upvotes: 4
Views: 909
Reputation: 6004
CPython has a table of built-in modules, such as math, that is defined in the file PC/config.c (used for Windows builds) and looks like this:
struct _inittab _PyImport_Inittab[] = {
    {"_abc", PyInit__abc},
    {"array", PyInit_array},
    {"_ast", PyInit__ast},
    {"audioop", PyInit_audioop},
    {"binascii", PyInit_binascii},
    {"cmath", PyInit_cmath},
    ...
};
So when it needs to import a built-in module, it looks inside this table instead of searching sys.path. Each of the "PyInit" functions in the table returns an in-memory module object.
The names in this table are then exposed as sys.builtin_module_names, which is initialized in sysmodule.c. The import code in importlib._bootstrap._find_spec is then called and goes over the list of finders in sys.meta_path. One of them is importlib._bootstrap.BuiltinImporter, which is responsible for importing built-in modules. The following session demonstrates the role of sys.meta_path:
>>> import sys
>>> sys.modules['math']
<module 'math' (built-in)>
>>> sys.path.clear()
>>> import math # This works because math is in the module cache.
>>> del sys.modules['math']
>>> import math # This works because of BuiltinImporter in sys.meta_path!
>>> sys.meta_path.clear()
>>> import math # This still works because math is in the module cache.
>>> del sys.modules['math']
>>> import math # This fails because we cleared sys.meta_path!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'math'
This was run on Python 3.7 with Anaconda; the results may vary under different distributions.
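If you would rather inspect the finders than clear sys.meta_path, something like the sketch below should work. It uses the public aliases in importlib.machinery and probes "sys" rather than math, because sys is compiled into every CPython build while math may not be.

```python
import sys
from importlib.machinery import BuiltinImporter, PathFinder

# "sys" is always compiled into the interpreter, so it is a safe built-in to probe.
name = "sys"
is_builtin = name in sys.builtin_module_names

# BuiltinImporter answers from the built-in table; it never looks at sys.path.
builtin_spec = BuiltinImporter.find_spec(name)

# PathFinder is the finder that walks a search path; given an empty one it finds nothing.
path_spec = PathFinder.find_spec(name, path=[])

# List the finders that the import system consults, in order.
print([getattr(finder, "__name__", repr(finder)) for finder in sys.meta_path])
print(is_builtin, builtin_spec.origin, path_spec)
```

The built-in finder returns a spec even though no search path is involved, while the path-based finder returns None for an empty path. That is the same split you see after sys.path.clear().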
I want to add that your test doesn't account for the module cache in sys.modules. Consider this example with a non-builtin module:
>>> import requests
>>> import sys
>>> sys.path.clear()
>>> import requests # This works!
>>> del sys.modules['requests']
>>> import requests # This doesn't.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'requests'
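The same cache effect can be reproduced without requests installed. This is only a sketch: cached_demo is a made-up module written to a temporary directory, and it removes a single entry from sys.path instead of clearing the whole list, so the rest of the session keeps working.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical throwaway module so the example does not depend on requests.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "cached_demo.py").write_text("VALUE = 42\n")

sys.path.append(str(module_dir))
importlib.invalidate_caches()
import cached_demo  # found via sys.path and stored in sys.modules

sys.path.remove(str(module_dir))  # the directory is no longer searched
import cached_demo  # still succeeds: served from sys.modules
served_from_cache = "cached_demo" in sys.modules

del sys.modules["cached_demo"]  # drop the cached module object
try:
    import cached_demo  # now the finders have to run again
    reimported = True
except ModuleNotFoundError:
    reimported = False

print(served_from_cache, reimported)
```

Once the entry in sys.modules is gone, the import machinery has to search again, and with the directory removed from sys.path it has nowhere to find the file.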
Upvotes: 2
Reputation: 5831
I tried to reproduce your example and, to my surprise, did not get the same result (note: Python 3.9 here):
import sys
sys.path.clear()
import math
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'math'
However, this works:
import math
del math
import sys
sys.path.clear()
import math
# but removing the reference in sys.modules will break the import again
del sys.modules['math']
import math
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'math'
My guess is that the interpreter keeps a reference to the math module from a previous import (in sys.modules), and thus has no need to search for it in sys.path.
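One possible reason for the different results, and this is an assumption rather than something confirmed in this thread, is that math is not compiled into every build of the interpreter. On some builds it ships as a separate extension file that is found through sys.path like any other module. A quick way to check on your own installation:

```python
import importlib.util
import sys

# Ask the import system where it would load math from.
spec = importlib.util.find_spec("math")

# True only if math is part of the interpreter binary itself.
compiled_in = "math" in sys.builtin_module_names

print("compiled into the interpreter:", compiled_in)
print("loader:", spec.loader)
print("origin:", spec.origin)  # "built-in" or a file path, depending on the build
```

If the origin is a file path, then clearing sys.path before the first import of math would be expected to fail, which would match the behaviour shown above.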
Upvotes: 0