Jay

Reputation: 2868

How does pip tell Python how to import C extensions?

I wish to use the sysv_ipc library in a portable manner.

I installed it with:

pip3 install sysv_ipc

Then from Python:

import sysv_ipc
sysv_ipc.__file__

# Output:
# /home/x/.local/lib/python3.9/site-packages/sysv_ipc.cpython-39-x86_64-linux-gnu.so

If I copy that file to a folder, pip uninstall the library, then open python from that folder and try the same import, it fails.

I tried to check what else was installed, and found:

/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info
/home/x/.local/lib/python3.9/site-packages/sysv_ipc.cpython-39-x86_64-linux-gnu.so
/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info/INSTALLER
/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info/LICENSE
/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info/METADATA
/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info/RECORD
/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info/REQUESTED
/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info/WHEEL
/home/x/.local/lib/python3.9/site-packages/sysv_ipc-1.1.0.dist-info/top_level.txt

I didn't find clues inside setup.py either.

What I would like to figure out is -

How/where does pip tell Python that sysv_ipc is to be imported from that specific file?

Upvotes: 2

Views: 1128

Answers (1)

Martijn Pieters

Reputation: 1121724

Pip plays no role in how Python handles extension module imports. All that Python needs is the extension module file itself, provided it is in a format supported by your current OS, and that the file is in a directory on your sys.path search path.

Pip is only responsible for making sure the files that make up a project distribution end up in a sys.path location. The .dist-info directory you found is part of the package metadata, used by pip and importlib.metadata for things like uninstalling, dependency tracking and reporting. These files are not used when importing.
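To see what the metadata records without involving the import system, you can query importlib.metadata directly. This is a minimal sketch; the helper name installed_files is made up for illustration, and it assumes Python 3.8 or newer:

```python
from importlib import metadata


def installed_files(dist_name):
    """Return the file paths recorded for a distribution, or [] if unknown.

    Hypothetical helper: it only reads the .dist-info metadata (the RECORD
    file); it never imports the module itself.
    """
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        # No .dist-info directory for this name on sys.path.
        return []
    # files() returns None when the distribution has no RECORD file.
    return [str(path) for path in (files or [])]


if __name__ == "__main__":
    for path in installed_files("sysv_ipc"):
        print(path)
```

After pip uninstall the list is empty, yet a copy of the .so file placed on sys.path can still be imported, because the two mechanisms are independent.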

You haven’t shared exactly how you tried to import the extension module or how this failed so I can’t comment on what went wrong for you.

But when things work correctly, importing a module from a dynamically loaded shared object library works a lot like importing regular modules:

  1. Python searches through all directories on the sys.path list for files and directories matching the imported name, using a PathFinder object. It knows to look for extension modules based on the file extension (the file extensions supported depend on your OS, see importlib.machinery.EXTENSION_SUFFIXES for a list).
  2. If a file with an extension suffix is found that matches the imported name then the importlib.machinery.ExtensionFileLoader class is used to load the library.
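The two steps above can be reproduced by hand, which is useful for checking what your interpreter would pick up. This is a rough sketch rather than what the import statement does internally in full; the function name locate is made up for illustration:

```python
import sys
from importlib.machinery import PathFinder


def locate(name):
    """Search sys.path for `name` and report what the finder came up with.

    Hypothetical helper. Returns None when nothing on sys.path matches,
    otherwise the file that was found and the loader class chosen for it.
    """
    spec = PathFinder.find_spec(name, sys.path)
    if spec is None:
        return None
    return {
        "origin": spec.origin,                  # path of the matching file
        "loader": type(spec.loader).__name__,   # e.g. ExtensionFileLoader
    }


if __name__ == "__main__":
    print(locate("sysv_ipc"))
```

If the extension module is found, the loader entry should read ExtensionFileLoader; a result of None means no file with a supported suffix was found in any sys.path directory.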

Loading means: using an OS-dependent dynamic loading function to load the code in the file and then to access an entry point function (usually PyInit_<modulename>) to get the module namespace. See the documentation on creating extension modules. For .so files the Python/dynload_shlib.c file implements the loader but there are other dynload_ implementations in the same directory. To load an .so file Python passes the file path (containing at least one / slash) to the dlopen() function.
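If you want to bypass the sys.path search entirely, you can hand the loader an explicit file path. The following is a sketch under the assumption that the file was built for the running interpreter; the function name load_extension is hypothetical:

```python
import sys
import importlib.util
from importlib.machinery import ExtensionFileLoader


def load_extension(name, path):
    """Load the extension module `name` from the shared library at `path`.

    Hypothetical helper. `name` must match the module name the library was
    compiled with, because the loader looks up the PyInit_<name> entry point.
    Raises ImportError if the file cannot be loaded.
    """
    loader = ExtensionFileLoader(name, path)
    spec = importlib.util.spec_from_file_location(name, path, loader=loader)
    module = importlib.util.module_from_spec(spec)  # loads the library
    sys.modules[name] = module
    loader.exec_module(module)                      # runs module init
    return module
```

For example, load_extension("sysv_ipc", "/some/dir/sysv_ipc.cpython-39-x86_64-linux-gnu.so") would be expected to work on a matching Python 3.9 build, with /some/dir standing in for wherever you copied the file.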

As to what may have gone wrong in your case: you may have used a different Python interpreter from the one the project was installed with. Note that the extension module filename includes a string after the module name that identifies the Python ABI (Application Binary Interface):

sysv_ipc.cpython-39-x86_64-linux-gnu.so
######## ^^^^^^^^^^^^^^^^^^^^^^^^^^^
module   ABI identifier

The identifier makes it possible to install extension files for multiple Python versions into the same directory. Do check what extensions your specific Python binary accepts by looking at importlib.machinery.EXTENSION_SUFFIXES:

$ python3 -c "from importlib.machinery import EXTENSION_SUFFIXES;print(EXTENSION_SUFFIXES)"
['.cpython-39-x86_64-linux-gnu.so', '.abi3.so', '.so']

The output tells me this interpreter will only look for sysv_ipc.cpython-39-x86_64-linux-gnu.so, sysv_ipc.abi3.so, and sysv_ipc.so file names to load.
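The same check can be written as a small function, handy when you are unsure whether a copied file has a name your interpreter will accept. A minimal sketch; is_loadable_filename is a made-up name:

```python
from importlib.machinery import EXTENSION_SUFFIXES


def is_loadable_filename(module_name, filename):
    """Return True if `filename` is a name this interpreter would look for
    when importing `module_name` as an extension module.

    Hypothetical helper: it compares names only and does not check that the
    file exists or that its contents are compatible.
    """
    return any(filename == module_name + suffix for suffix in EXTENSION_SUFFIXES)


if __name__ == "__main__":
    name = "sysv_ipc.cpython-39-x86_64-linux-gnu.so"
    print(is_loadable_filename("sysv_ipc", name))
```

Run with a different interpreter (say, Python 3.8), the 3.9 filename is rejected, which is one way the import in the question could fail.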

A given Python version supports specific exported C functions that an extension module may want to make use of, and the ABI identifier tells you which version the extension was compiled against. Extensions that use the short abi3.so suffix are compiled against the stable ABI, a smaller subset of Python functionality that is guaranteed to exist across many Python releases.

While you can rename an extension file to use only the shortest suffix ([module_name].so), whether it still works on a different Python version depends heavily on what Python functionality the dynamically loaded machine code calls into.

Here is a quick demo showing that you can import the sysv_ipc dynamic library from an arbitrary directory, provided you use the right Python version:

$ virtualenv /demo
... creating a virtualenv ...
done.
$ cd /demo
/demo/ $ source bin/activate
(demo) /demo/ $ pip install sysv_ipc
Collecting sysv_ipc
... installing ...
Successfully installed sysv-ipc-1.1.0
(demo) /demo/ $ mkdir newdir
(demo) /demo/ $ cp lib/python3.9/site-packages/sysv_ipc.cpython-39-x86_64-linux-gnu.so newdir
(demo) /demo/ $ pip uninstall -y sysv_ipc
Found existing installation: sysv-ipc 1.1.0
... uninstalling ...
  Successfully uninstalled sysv-ipc-1.1.0
(demo) /demo/ $ cd newdir/
(demo) /demo/newdir/ $ python
Python 3.9.2 (default, Mar 15 2021, 17:53:50)
[Clang 7.0.1 (tags/RELEASE_701/final)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sysv_ipc
>>> sysv_ipc.__file__
'/demo/newdir/sysv_ipc.cpython-39-x86_64-linux-gnu.so'

Upvotes: 6
