Daniel Stephens

Reputation: 3209

How to extend Python and make a C-package?

I embedded and extended Python 2.7 in my C application a while ago. I am now, rather late, porting it to Python 3, and much of the module registration and initialization has changed.

Previously, I used PyModule_Create to create the module and added the members afterwards, including sub-modules, so that I could execute:

from foo.bar import bas

I added/appended the 'top-level' module to PyEval_GetBuiltins(), which might have been wrong in Py 2, but it worked. Now in Py 3 I receive this exception on the code above:

Traceback (most recent call last):
  File "foo.py", line 1, in <module>
ModuleNotFoundError: No module named 'foo.bar'; 'foo' is not a package

Looking through the docs, I found an example that uses PyImport_ExtendInittab. I have two questions about it:

1) What is Inittab supposed to mean? The docs say what the function does, but the naming is slightly irritating. What is an Inittab? Shouldn't it be called PyImport_ExtendBuiltins? That I would understand.

2) I can only find examples where plain modules get added. Is creating a package with sub-modules possible with PyImport_ExtendInittab too?

Thanks a lot!

Upvotes: 2

Views: 472

Answers (3)

Luci Stanescu

Reputation: 63

I'm one year late with this answer, but, having stumbled upon the same problem as the OP, I believe I have found a cleaner solution than the accepted answer.

I'm only going to cover Python 3, as this is what the OP was looking to address and, well, it's 2021.

The Problem

A built-in module follows the same conventions as an extension module, but is not compiled into a shared library and distributed as a file. This makes most sense when embedding Python into a larger application, where the module should not be accessible to a general-purpose Python application or to the interactive interpreter.

A built-in module is registered to the interpreter using PyImport_ExtendInittab, as the OP had found. However, if the name is nested (e.g. foo.bar.bas, instead of bas), the default import machinery will not work.
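
As a small illustration of what "built-in" means from the Python side, the sketch below inspects the interpreter it runs in. It uses sys as the example because it is compiled into every CPython interpreter; a module registered through PyImport_ExtendInittab would be expected to show up in the same tuple, although that part is not exercised here.

```python
import importlib.machinery
import importlib.util
import sys

# Names of the modules compiled into (or registered with) this interpreter.
print("sys" in sys.builtin_module_names)

# The spec of a built-in module records which importer is responsible for it.
spec = importlib.util.find_spec("sys")
print(spec.origin)
print(spec.loader is importlib.machinery.BuiltinImporter)
```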

Issues with the Accepted Answer

The accepted answer loads the module and executes it as soon as it is registered to the interpreter (i.e. when the PyMODINIT_FUNC function is called). Subsequently importing the module from Python would simply return the object in sys.modules.

Additionally, this wouldn't work with the newer (and recommended) Multi-Phase Initialization, which has consequences for the ability to reload a module and for the use of sub-interpreters.

The Cause of the Problem

The Python import machinery is very well documented. Any imported module (whether a shared-library-backed extension, built-in and registered via PyImport_ExtendInittab or pure Python) needs to be located by a MetaPathFinder registered in sys.meta_path. By default, built-in modules are located by importlib.machinery.BuiltinImporter (which happens to be a Loader, as well). However, its find_spec method is defined as:

    @classmethod
    def find_spec(cls, fullname, path=None, target=None):
        if path is not None:
            return None
        if _imp.is_builtin(fullname):
            return spec_from_loader(fullname, cls, origin=cls._ORIGIN)
        else:
            return None

A nested module (e.g. foo.bar.bas) is looked up by calling the find_spec method with its parent package's __path__ attribute as the second argument (i.e. find_spec('foo.bar.bas', foo.bar.__path__)).

This can be easily tested by setting up a pure Python parent package (e.g. foo/bar/__init__.py in the Python path) with:

__path__ = None

A built-in extension module named foo.bar.bas and registered via PyImport_ExtendInittab will then be importable.

This behaviour is somewhat documented:

Some meta path finders only support top level imports. These importers will always return None when anything other than None is passed as the second argument.
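
The role of the second argument can be probed directly from Python. This is only a sketch: the path value is a made-up placeholder, and sys stands in for a registered built-in. On interpreters whose find_spec matches the source quoted above, the second call returns None; other Python versions may behave differently, so the result is printed rather than assumed.

```python
import importlib.machinery

finder = importlib.machinery.BuiltinImporter

# Top-level lookup: the import system passes None as the path.
top_level = finder.find_spec("sys", None)
print(top_level)

# Nested-style lookup: the import system passes the parent's __path__.
# The list below is a hypothetical placeholder, not a real package path.
nested_style = finder.find_spec("sys", ["/hypothetical/parent/path"])
print(nested_style)
```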

The Solution

The test above is a bit of a hack that depends on knowledge of implementation details and, anyway, can only be considered a solution if no non-built-in modules would be desired beneath foo.bar – a pure Python module named foo.bar.moo (i.e. defined in foo/bar/moo.py) would fail to import in this case.

A much cleaner solution is to define a MetaPathFinder, which also seems to be encouraged:

The most reliable mechanism for replacing the entire import system is to delete the default contents of sys.meta_path, replacing them entirely with a custom meta path hook.

Of course, we can retain the existing MetaPathFinders, simply extending the list. The following code defined in foo/bar/__init__.py (relying exclusively on documented and non-deprecated APIs, at the time of this writing) would do the trick:

import importlib.abc
import importlib.machinery
import importlib.util
import sys


class CustomBuiltinImporter(importlib.abc.MetaPathFinder):
    _ORIGIN = 'custom-builtin'

    @classmethod
    def find_spec(cls, fullname, path, target=None):
        if path != __path__ or not fullname.startswith(cls.__module__ + '.'):
            return None
        if fullname not in sys.builtin_module_names:
            return None
        return importlib.util.spec_from_loader(fullname, importlib.machinery.BuiltinImporter, origin=cls._ORIGIN)


sys.meta_path.append(CustomBuiltinImporter)

This code will not allow built-in modules defined under anything other than foo.bar to be loaded. Of course, a custom MetaPathFinder can be defined anywhere (including in some bootstrapping code of the application), but the first test of the find_spec method would need adapting. Such an implementation would also allow foo.bar to be a namespace package, thus providing even more flexibility for its contents.
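
A variant that could live in application bootstrapping code might look like the sketch below. The factory function make_nested_builtin_finder and the class name are hypothetical, and the check against the parent's __path__ is replaced by a plain name-prefix test, which is one possible adaptation rather than the only one. The finder is not installed into sys.meta_path here.

```python
import importlib.abc
import importlib.machinery
import importlib.util
import sys


def make_nested_builtin_finder(package_name):
    """Build a finder for built-ins registered below package_name (hypothetical helper)."""
    prefix = package_name + "."

    class NestedBuiltinFinder(importlib.abc.MetaPathFinder):
        @classmethod
        def find_spec(cls, fullname, path=None, target=None):
            # Only handle names below the chosen package.
            if not fullname.startswith(prefix):
                return None
            # Only handle names that were registered as built-in modules.
            if fullname not in sys.builtin_module_names:
                return None
            # Delegate the actual loading to the standard built-in importer.
            return importlib.util.spec_from_loader(
                fullname, importlib.machinery.BuiltinImporter, origin="custom-builtin"
            )

    return NestedBuiltinFinder


finder = make_nested_builtin_finder("foo.bar")
outside_prefix = finder.find_spec("json", None)
unregistered = finder.find_spec("foo.bar.bas", None)
print(outside_prefix, unregistered)
```

In an interpreter where no built-in named foo.bar.bas has been registered, both lookups decline by returning None, which lets the remaining finders in sys.meta_path handle the name. An embedding application would append the returned class to sys.meta_path during start-up.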

Upvotes: 0

CristiFati

Reputation: 41116

I don't know whether what you're trying to do here (nested extension modules) is OK; in any case, the recommended way of structuring code is via [Python 3.Docs]: Modules - Packages.
However, I did this (reproducing the problem, fixing it) as a personal exercise.

1. Intro

Listing the 2 relevant pages:

The environment:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q061692747]> tree /a /f
Folder PATH listing for volume SSD0-WORK
Volume serial number is AE9E-72AC
E:.
|   test00.py
|
+---py2
|       mod.c
|
\---py3
        helper.c
        mod.c


2. Python 2

Dummy module attempting to reproduce the behavior mentioned in the question.

mod.c:

#include <stdio.h>
#include <Python.h>

#define MOD_NAME "mod"
#define SUBMOD_NAME "submod"


static PyObject *pMod = NULL;
static PyObject *pSubMod = NULL;

static PyMethodDef modMethods[] = {
    {NULL}
};


PyMODINIT_FUNC initmod() {
    if (!pMod) {
        pMod = Py_InitModule(MOD_NAME, modMethods);
        if (pMod) {
            PyModule_AddIntConstant(pMod, "i", -69);
            pSubMod = Py_InitModule(MOD_NAME "." SUBMOD_NAME, modMethods);
            if (pSubMod) {
                PyModule_AddStringConstant(pSubMod, "s", "dummy");
                if (PyModule_AddObject(pMod, SUBMOD_NAME, pSubMod) < 0) {
                    Py_XDECREF(pMod);
                    Py_XDECREF(pSubMod);
                    return;
                }
            }
        }
    }
}

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q061692747\py2]> sopr.bat
*** Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ***

[prompt]> "f:\Install\pc032\Microsoft\VisualCForPython2\2008\Microsoft\Visual C++ for Python\9.0\vcvarsall.bat" x64
Setting environment for using Microsoft Visual Studio 2008 x64 tools.

[prompt]> dir /b
mod.c

[prompt]> cl /nologo /MD /DDLL /I"c:\Install\pc064\Python\Python\02.07.17\include" mod.c  /link /NOLOGO /DLL /OUT:mod.pyd /LIBPATH:"c:\Install\pc064\Python\Python\02.07.17\libs"
mod.c
   Creating library mod.lib and object mod.exp

[prompt]> dir /b
mod.c
mod.exp
mod.lib
mod.obj
mod.pyd
mod.pyd.manifest

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_02.07.17_test0\Scripts\python.exe"
Python 2.7.17 (v2.7.17:c2f86d86e6, Oct 19 2019, 21:01:17) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>>
>>> [item for item in sys.modules if "mod" in item]
[]
>>> import mod
>>>
>>> [item for item in sys.modules if "mod" in item]  # !!! NOTICE the contents !!!
['mod.submod', 'mod']
>>>
>>> mod
<module 'mod' from 'mod.pyd'>
>>> mod.i
-69
>>> mod.submod
<module 'mod.submod' (built-in)>
>>> mod.submod.s
'dummy'
>>>
>>> from mod.submod import s
>>> s
'dummy'
>>>

As seen, importing the module with submodules adds the submodules to sys.modules (I didn't check, but I am 99.99% sure this is performed by Py_InitModule).


3. Python 3

Conversion to Python 3. Since this is the 1st step, treat the 2 commented lines as if they were not there.

mod.c:

#include <stdio.h>
#include <Python.h>
//#include "helper.c"

#define MOD_NAME "mod"
#define SUBMOD_NAME "submod"


static PyObject *pMod = NULL;
static PyObject *pSubMod = NULL;

static PyMethodDef modMethods[] = {
    {NULL}
};

static struct PyModuleDef modDef = {
    PyModuleDef_HEAD_INIT, MOD_NAME, NULL, -1, modMethods,
};

static struct PyModuleDef subModDef = {
    PyModuleDef_HEAD_INIT, MOD_NAME "." SUBMOD_NAME, NULL, -1, modMethods,
};


PyMODINIT_FUNC PyInit_mod() {
    if (!pMod) {
        pMod = PyModule_Create(&modDef);
        if (pMod) {
            PyModule_AddIntConstant(pMod, "i", -69);
            pSubMod = PyModule_Create(&subModDef);
            if (pSubMod) {
                PyModule_AddStringConstant(pSubMod, "s", "dummy");
                if (PyModule_AddObject(pMod, SUBMOD_NAME, pSubMod) < 0) {
                    Py_XDECREF(pMod);
                    Py_XDECREF(pSubMod);
                    return NULL;
                }
                //addToSysModules(MOD_NAME "." SUBMOD_NAME, pSubMod);
            }
        }
    }
    return pMod;
}

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q061692747\py3]> sopr.bat
*** Set shorter prompt to better fit when pasted in StackOverflow (or other) pages ***

[prompt]> "c:\Install\pc032\Microsoft\VisualStudioCommunity\2017\VC\Auxiliary\Build\vcvarsall.bat" x64
**********************************************************************
** Visual Studio 2017 Developer Command Prompt v15.9.23
** Copyright (c) 2017 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'

[prompt]> dir /b
helper.c
mod.c

[prompt]> cl /nologo /MD /DDLL /I"c:\Install\pc064\Python\Python\03.07.06\include" mod.c  /link /NOLOGO /DLL /OUT:mod.pyd /LIBPATH:"c:\Install\pc064\Python\Python\03.07.06\libs"
mod.c
   Creating library mod.lib and object mod.exp

[prompt]> dir /b
helper.c
mod.c
mod.exp
mod.lib
mod.obj
mod.pyd

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe"
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>>
>>> [item for item in sys.modules if "mod" in item]
[]
>>> import mod
>>>
>>> [item for item in sys.modules if "mod" in item]  # !!! NOTICE the contents !!!
['mod']
>>>
>>> mod
<module 'mod' from 'e:\\Work\\Dev\\StackOverflow\\q061692747\\py3\\mod.pyd'>
>>> mod.i
-69
>>> mod.submod
<module 'mod.submod'>
>>> mod.submod.s
'dummy'
>>>
>>> from mod.submod import s
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'mod.submod'; 'mod' is not a package
>>> ^Z


[prompt]>

As seen, the nested import is not possible. That is because mod.submod is not present in sys.modules. As a generalization, "nested" extension submodules are no longer made importable by the initialization function of the module that contains them. The only option is to import them manually.
As a note: I think this Python 3 restriction is there for a reason, so what comes below is like playing with fire.
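
The same behaviour can be reproduced without compiling anything. The sketch below builds two plain module objects in pure Python; the names demo_mod and demo_mod.submod are made up for the illustration and stand in for the extension module and its submodule.

```python
import sys
import types

# A parent module with a submodule attached as an attribute only.
parent = types.ModuleType("demo_mod")
child = types.ModuleType("demo_mod.submod")
child.s = "dummy"
parent.submod = child
sys.modules["demo_mod"] = parent

# Attribute access works, but the import system cannot find the child:
# the parent has no __path__, so it is not treated as a package.
try:
    from demo_mod.submod import s
except ModuleNotFoundError as exc:
    first_error = str(exc)
else:
    first_error = None
print(first_error)

# Registering the child under its dotted name makes the import succeed.
sys.modules["demo_mod.submod"] = child
from demo_mod.submod import s
print(s)
```

The import statement consults sys.modules before any finder runs, which is why the manual registration is enough here.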

Uncomment the 2 lines from mod.c.

helper.c:

int addToSysModules(const char *pName, PyObject *pMod) {
    PyObject *pSysModules = PySys_GetObject("modules");
    if (!PyDict_Check(pSysModules)) {
        return -1;
    }
    PyObject *pKey = PyUnicode_FromString(pName);
    if (!pKey) {
        return -2;
    }
    if (PyDict_Contains(pSysModules, pKey)) {
        Py_XDECREF(pKey);
        return -3;
    }
    Py_XDECREF(pKey);
    if (PyDict_SetItemString(pSysModules, pName, pMod) == -1)
    {
        return -4;
    }
    return 0;
}

Output:

[prompt]> cl /nologo /MD /DDLL /I"c:\Install\pc064\Python\Python\03.07.06\include" mod.c  /link /NOLOGO /DLL /OUT:mod.pyd /LIBPATH:"c:\Install\pc064\Python\Python\03.07.06\libs"
mod.c
   Creating library mod.lib and object mod.exp

[prompt]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe"
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import sys
>>>
>>> [item for item in sys.modules if "mod" in item]
[]
>>> import mod
>>>
>>> [item for item in sys.modules if "mod" in item]  # !!! NOTICE the contents :) !!!
['mod.submod', 'mod']
>>>
>>> from mod.submod import s
>>> s
'dummy'
>>>


4. Closing notes

As I stated above, this seems more like a workaround. A cleaner solution would be to better organize the modules via packages.

Since this is for demo purposes, and to keep the code as simple as possible, I didn't always check the return codes of Python C API functions. Skipping these checks can lead to hard-to-find errors (even crashes) and should never be done, especially in production code.

I am not sure what the effect of PyImport_ExtendInittab really is, as I didn't play with it, but [Python 3.Docs]: Importing Modules - int PyImport_ExtendInittab(struct _inittab *newtab) states (emphasis is mine):

This should be called before Py_Initialize().

So calling it in our context is out of the question.

Also mentioning this (old) discussion (not sure whether it contains relevant information, but still) [Python.Mail]: [Python-Dev] nested extension modules?.

Upvotes: 2

bug_spray

Reputation: 1506

Without a minimal reproducible example, it's hard to tell what's wrong and what specifically you're looking for in an answer. Nevertheless, I will attempt to provide some assistance.

from foo.bar import bas

For the above to work with plain Python files, you need a file bar.py in a folder called foo, and bar.py must define the name bas (for example, a function bas()). The folder foo should also contain an __init__.py file, which may be empty.
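
A minimal sketch of that layout, created in a temporary directory so that it can be run as-is. The body of bas() is invented for the illustration.

```python
import importlib
import pathlib
import sys
import tempfile

# Build foo/__init__.py and foo/bar.py inside a throwaway directory.
root = pathlib.Path(tempfile.mkdtemp())
package_dir = root / "foo"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "bar.py").write_text("def bas():\n    return 'called bas'\n")

# Make the directory that contains foo/ visible to the import system.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from foo.bar import bas
print(bas())
```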

Now, if you want to run a compiled C program from Python, then probably the easiest way to accomplish this is to use os.system() or subprocess.call() and invoke the program as if you were calling it from the command line.

Assuming the make file is in the same directory:

import os
import subprocess

os.system("make run")

# or
subprocess.run("make run".split())

Here, make run runs your C program as declared in your makefile. You can also build the command arguments with Python f-strings.
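
If the output of the external program is needed in Python, a sketch along these lines may be closer to what is wanted. Since the makefile is not shown, the current interpreter is used as a stand-in for the compiled program; the command list is the only part that would change.

```python
import subprocess
import sys

# Stand-in for a compiled program: the current interpreter printing one line.
command = [sys.executable, "-c", "print('hello from child')"]

# check=True raises CalledProcessError on a non-zero exit status;
# capture_output=True collects stdout and stderr instead of inheriting them.
result = subprocess.run(command, capture_output=True, text=True, check=True)
print(result.returncode)
print(result.stdout.strip())
```

Passing the command as a list avoids shell quoting issues that a single command string can run into.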

Hope this helps.

Upvotes: 0
