Reputation: 8246
I really want to get this right because I keep running into it when generating some big py2app/py2exe packages. My package contains a lot of modules/packages that might also be in the user's site-packages/default location (if the user has a Python distribution installed), but I want my distributed packages to take precedence over them when running from my distribution.
Now, from what I've read here, PYTHONPATH should be the first thing added to sys.path after the current directory. However, from what I've tested on my machine, that is not the case: all the folders defined in $site-packages$/easy-install.pth take precedence over it.
Could someone please give me a more in-depth explanation of this import order and help me find a way to set the environment variables so that the packages I distribute take precedence over the default installed ones?
So far my attempt is, for example on Mac-OS py2app, in my entry point script:
# Search the bundle root, its lib folder and the bundled site-packages, in that order.
os.environ['PYTHONPATH'] = ':'.join([
    DATA_PATH,
    os.path.join(DATA_PATH, 'lib'),
    os.path.join(DATA_PATH, 'lib', 'python2.7', 'site-packages'),
    os.path.join(DATA_PATH, 'lib', 'python2.7', 'site-packages.zip'),
])
This is basically the structure of the package generated by py2app. Then I just:
SERVER = subprocess.Popen(
    [PYTHON_EXE_PATH, '-m', 'bin.rpserver', cfg.RPC_SERVER_IP, cfg.RPC_SERVER_PORT],
    shell=False, stdin=IN_FILE, stdout=OUT_FILE, stderr=ERR_FILE)
Here PYTHON_EXE_PATH is the path to the Python executable that py2app adds to the package. This works fine on a machine that doesn't have Python installed. However, when a Python distribution is already present, its site-packages take precedence.
Upvotes: 37
Views: 68082
Reputation: 4589
This page is a high Google result for "Python import order", so here's a hopefully clearer explanation:
As the Python documentation pages on sys.path and the module search path explain, the import order is:
1. Modules already present in sys.modules (the cache of everything loaded so far, including the modules the interpreter loads at startup).
2. The sys.path entries, searched in order.
And as the sys.path doc page explains, it is populated as follows:
1. The first entry is the full path to the directory of the file that python was started with. So /someplace/on/disk/> $ python /path/to/the/run.py means the first path is /path/to/the/, and the path is the same if you run /path/to/> $ python the/run.py: it is always set to the full path of the directory, whether you gave python a relative or an absolute file. If python was started without a file (interactive mode), the first entry is an empty string, which means "current working directory of the python process". In other words, Python assumes that the file you started wants to be able to import the package folders and blah.py modules that sit in the same location as that file.
2. The other entries in sys.path are populated from the PYTHONPATH environment variable, followed by the installation-dependent defaults. The latter include the global site-packages folders where your third-party pip packages are installed (things like requests and numpy and tensorflow).
So, basically: Yes, you can trust that Python will find your local package folders and module files first, before any globally installed pip stuff.
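To check where a particular name would actually be resolved from, something like the following sketch may help. The helper name describe_origin is made up for illustration; it only relies on sys.modules and importlib.util.find_spec from the standard library, and it is meant for top-level module names.

```python
import importlib.util
import sys


def describe_origin(name):
    """Say where `import name` would be satisfied from (hypothetical helper)."""
    # Step 1: anything already cached in sys.modules wins immediately.
    if name in sys.modules:
        return "already loaded (sys.modules)"
    # Step 2: otherwise ask the import system, which walks sys.path in order.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    # origin is a file path for ordinary modules and None for namespace packages.
    return spec.origin or "namespace package"


if __name__ == "__main__":
    for candidate in ("sys", "json", "no_such_module_xyz_123"):
        print(candidate, "->", describe_origin(candidate))
```

Running it for one of your own module names shows whether the copy next to your script or a copy in site-packages would be picked.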
Here's an example to explain further:
myproject/            # <-- This is not a package (no __init__.py file).
    modules/          # <-- This is a package (has an __init__.py file).
        __init__.py
        foo.py
    run.py
    second.py
executed with: python /path/to/the/myproject/run.py
will cause sys.path[0] to be "/path/to/the/myproject/"
run.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"
import second # will import "/path/to/the/myproject/" + "second.py"
second.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"
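If you want to try this layout without creating the files by hand, here is a rough sketch that builds it in a temporary directory and runs run.py in a fresh interpreter. The function name run_demo and the file contents are assumptions made for the demo; only the folder layout comes from the example above.

```python
import os
import subprocess
import sys
import tempfile
import textwrap


def run_demo():
    """Build myproject/ in a temp dir, run run.py, and report what it saw."""
    with tempfile.TemporaryDirectory() as root:
        project = os.path.join(root, "myproject")
        os.makedirs(os.path.join(project, "modules"))
        files = {
            os.path.join("modules", "__init__.py"): "",
            os.path.join("modules", "foo.py"): "NAME = 'foo'\n",
            "second.py": "import modules.foo as foo\n",
            "run.py": textwrap.dedent("""\
                import sys
                import modules.foo as foo
                import second
                print(sys.path[0])
                print(foo.__file__)
                print(second.foo is foo)
            """),
        }
        for rel, body in files.items():
            with open(os.path.join(project, rel), "w") as fh:
                fh.write(body)
        lines = subprocess.run(
            [sys.executable, os.path.join(project, "run.py")],
            capture_output=True, text=True, check=True, cwd=root,
        ).stdout.splitlines()
        # Resolve symlinks so the paths compare equal even if the temp dir is a symlink.
        return (
            os.path.realpath(project),
            os.path.realpath(lines[0]),
            os.path.realpath(lines[1]),
            lines[2],
        )


if __name__ == "__main__":
    print(run_demo())
```

The first two returned paths should match, and the third should point at modules/foo.py inside the project folder, because both run.py and second.py resolve modules.foo through the same sys.path[0] entry.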
EDIT:
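You can run the following command to print a sorted list of every module that is already loaded by the time your code starts running (the command itself imports json, so the json entries appear as well). These load before ANY custom files/module folders in your projects, so a file of yours with the same name will never be picked up. Basically, these are names you must avoid in your own custom files: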
python -c "import sys, json; print(json.dumps(sorted(list(sys.modules.keys())), indent=4))"
List as of Python 3.9.0 on Windows (entries such as nt, ntpath and winreg are Windows-specific):
"__main__",
"_abc",
"_bootlocale",
"_codecs",
"_collections",
"_collections_abc",
"_frozen_importlib",
"_frozen_importlib_external",
"_functools",
"_heapq",
"_imp",
"_io",
"_json",
"_locale",
"_operator",
"_signal",
"_sitebuiltins",
"_sre",
"_stat",
"_thread",
"_warnings",
"_weakref",
"abc",
"builtins",
"codecs",
"collections",
"copyreg",
"encodings",
"encodings.aliases",
"encodings.cp1252",
"encodings.latin_1",
"encodings.utf_8",
"enum",
"functools",
"genericpath",
"heapq",
"io",
"itertools",
"json",
"json.decoder",
"json.encoder",
"json.scanner",
"keyword",
"marshal",
"nt",
"ntpath",
"operator",
"os",
"os.path",
"pywin32_bootstrap",
"re",
"reprlib",
"site",
"sre_compile",
"sre_constants",
"sre_parse",
"stat",
"sys",
"time",
"types",
"winreg",
"zipimport"
So NEVER use any of those names for your .py files or your project module subfolders.
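As a small aid, the check can be automated. Note that sys.modules is the cache of modules loaded so far, while the names compiled into the interpreter are listed separately in sys.builtin_module_names; the sketch below consults both. The helper name clashes_with_loaded_or_builtin is hypothetical.

```python
import sys


def clashes_with_loaded_or_builtin(name):
    """Return True if a project file called `name` would be shadowed (hypothetical helper)."""
    # Already-loaded modules are served from the cache before sys.path is searched,
    # and built-in modules are found before any directory on sys.path.
    return name in sys.modules or name in sys.builtin_module_names


if __name__ == "__main__":
    for candidate in ("sys", "builtins", "my_unique_project_module_xyz"):
        print(candidate, clashes_with_loaded_or_builtin(candidate))
```

This only covers names that are loaded or built in for the interpreter you run it with, so it is a lower bound on the names worth avoiding, not a complete list.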
Upvotes: 21
Reputation: 721
Even though the above answers regarding the order in which the interpreter scans sys.path are correct, giving precedence to e.g. user file paths over packages deployed in site-packages might fail if the full user path is not available in the PYTHONPATH variable.
For example, imagine you have the following structure of namespace packages:
/opt/repo_root
- project # this is the base package that brings structure to the namespace hierarchy
- my_pkg
- my_pkg-core
- my_pkg-gui
- my_pkg-helpers
- my_pkg-helpers-time_sync
The above packages all have the internal needed structure and metadata in order to be deployable by conda, and these are also all installed. Therefore, I can open a python shell and type:
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py
This returns a path in the Python interpreter's site-packages subfolder. If I manually add the package to be imported to PYTHONPATH, or even to sys.path, nothing changes:
>>> import os
>>> # joining separator ":" for Unix, ";" for NT
>>> os.environ['PYTHONPATH'] = ":".join([os.environ['PYTHONPATH'], "/opt/repo_root/my_pkg-helpers-time_sync"])
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py
This still reports that the package was imported from site-packages. You need to include the whole hierarchy of paths in PYTHONPATH, as if it were a traditional Python package, and then it will work as you expect:
>>> import os
>>> # joining separator ":" for Unix, ";" for NT
>>> os.environ['PYTHONPATH'] = ":".join([
...     os.environ['PYTHONPATH'],
...     "/opt/repo_root",
...     "/opt/repo_root/project",
...     "/opt/repo_root/project/my_pkg",
...     "/opt/repo_root/project/my_pkg-helpers",
...     "/opt/repo_root/project/my_pkg-helpers-time_sync"
... ])
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/opt/project/my_pkg/helpers/time_sync/__init__.py
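One caveat worth keeping in mind: PYTHONPATH is read when the interpreter starts, so assigning to os.environ['PYTHONPATH'] in a running session is not expected to change that session's sys.path. The new value is only seen by interpreters started afterwards, for example a child process. The sketch below illustrates this with a hypothetical helper, child_sys_path, which launches a fresh interpreter with one extra directory on PYTHONPATH and returns the sys.path that child reports.

```python
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Start a fresh interpreter with extra_dir on PYTHONPATH; return its sys.path."""
    env = dict(os.environ)  # copy, so the current process environment is untouched
    existing = env.get("PYTHONPATH")
    # os.pathsep is ":" on Unix and ";" on Windows.
    env["PYTHONPATH"] = extra_dir if not existing else os.pathsep.join([extra_dir, existing])
    out = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()


if __name__ == "__main__":
    for entry in child_sys_path(os.getcwd()):
        print(repr(entry))
```

To change the search path of the interpreter you are already in, modify sys.path itself (for example with sys.path.insert(0, ...)) before the package is first imported.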
Upvotes: 1
Reputation: 1366
When you import a module, Python first looks it up in sys.modules, a dictionary of the modules that have already been loaded.
If it is not found there, Python searches the directories listed in sys.path, in order. The exact contents of that list depend on your operating system and installation.
import sys
import time
print(sys.modules)
print(sys.path)
The output is a dictionary of loaded modules followed by a list of directories:
{... , ... , .....}
['C:\\Users\\****', 'C:\\****', ...]
The time module is resolved in that order: sys.modules is checked first, and sys.path is searched only if the module is not already loaded.
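A short sketch of the caching behaviour, using a made-up helper name import_twice: the second import of a name does not search sys.path again, it returns the object already stored in sys.modules.

```python
import importlib
import sys


def import_twice(name):
    """Import `name` twice and check both results are the cached module object."""
    first = importlib.import_module(name)
    cached = sys.modules[name]  # the entry created (or reused) by the first import
    second = importlib.import_module(name)  # served from the cache
    return first is cached and second is cached


if __name__ == "__main__":
    print(import_twice("time"))
```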
Upvotes: 1
Reputation: 1142
Python searches the paths in sys.path in order (see http://docs.python.org/tutorial/modules.html#the-module-search-path). easy_install changes this list directly (see the last line in your easy-install.pth file):
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
This basically takes whatever directories are added and inserts them at the beginning of the list.
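To make the effect concrete, here is a sketch that applies the same slicing steps to an ordinary list instead of sys.path. The function name egg_insert and the example entries are made up; plen stands in for sys.__plen, which, as far as I can tell, the first line of easy-install.pth sets to the length of sys.path before the egg entries are appended.

```python
def egg_insert(path, plen, egginsert=0):
    """Mimic the easy-install.pth one-liner on a plain list (illustrative only)."""
    new = path[plen:]              # entries appended since plen was recorded
    del path[plen:]                # remove them from the end...
    path[egginsert:egginsert] = new  # ...and splice them in near the front
    return egginsert + len(new)    # next insertion point, like sys.__egginsert


if __name__ == "__main__":
    search_path = ["script_dir", "pythonpath_dir", "site-packages", "a.egg", "b.egg"]
    next_insert = egg_insert(search_path, plen=3)
    print(search_path)   # the two .egg entries now come first
    print(next_insert)
```

With the example list, the two .egg entries end up ahead of both the script directory and the PYTHONPATH entry, which matches the behaviour described in the question.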
Also see Eggs in path before PYTHONPATH environment variable.
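One possible workaround, offered as a sketch rather than a tested recipe for py2app: since .pth processing happens at interpreter startup, the entry module of the child process can move the bundled directories back to the front of sys.path before importing anything else. The helper name prioritize and the example paths are hypothetical.

```python
import sys


def prioritize(paths, search_path=None):
    """Move (or add) `paths` to the front of the search path, keeping their order."""
    if search_path is None:
        search_path = sys.path
    wanted = list(paths)
    rest = [p for p in search_path if p not in wanted]
    # Modify the list in place so existing references to sys.path stay valid.
    search_path[:] = wanted + rest
    return search_path


if __name__ == "__main__":
    demo = ["/eggs/a.egg", "/bundle/lib", "/usr/lib/site-packages"]
    print(prioritize(["/bundle", "/bundle/lib"], demo))
```

Called at the very top of the entry module with the bundled directories, this puts them ahead of any egg entries. Another option is to start the child with python -S, which skips the site module (and therefore .pth files) altogether, at the cost of site-packages not being added automatically.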
Upvotes: 25