Michele Agostini

Reputation: 31

Python module import fails using the C API if more than one subinterpreter imports a module that imports certain specific modules

I wrote single-threaded C++ code using the Python C API that starts two subinterpreters, Sub1 and Sub2, and then loads a module M in each of them. If M imports certain modules (such as urllib.request or yaml), importing M in Sub2 triggers a memory corruption error when Sub1 has already imported it. It works if M imports other modules instead (such as urllib, base64, os or sys).

Here is a snippet that reproduces the error:

    Py_Initialize();

    PyThreadState *tstate_main, *tstate_s1, *tstate_s2;
    
    tstate_main = PyThreadState_Get();
    std::cerr << "tstate_main: " << tstate_main << std::endl;

    //PyGILState_STATE gstate = PyGILState_Ensure();
    PyInterpreterConfig config_s1 = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 0,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,
    };
    tstate_s1 = NULL;
    PyStatus status_s1 = Py_NewInterpreterFromConfig(&tstate_s1, &config_s1);
    std::cerr << "tstate_s1: " << tstate_s1 << std::endl;
    std::string sysPathCmd1 = "import sys\nsys.path.append('" + cwd + "')";
    PyRun_SimpleString(sysPathCmd1.c_str());
    PyObject* bytecode1 = Py_CompileString(module_code1, "test_module1", Py_file_input);
    PyObject* pModule1 = PyImport_ExecCodeModule("test_module1", bytecode1);
    if(!pModule1) {
        std::cerr << "Error on module import:" << std::endl;
        PyErr_Print();
        return -1;
    }
    PyObject* pFunc1 = PyObject_GetAttrString(pModule1, "test_call");
    PyObject* pData1 = PyUnicode_FromString("hello");
    PyObject* pArgs1 = PyTuple_Pack(1, pData1);
    PyObject* pResult1 = PyObject_CallObject(pFunc1, pArgs1);

    PyEval_RestoreThread(tstate_main);

    //PyGILState_STATE gstate = PyGILState_Ensure();
    PyInterpreterConfig config_s2 = {
        .use_main_obmalloc = 0,
        .allow_fork = 0,
        .allow_exec = 0,
        .allow_threads = 0,
        .allow_daemon_threads = 0,
        .check_multi_interp_extensions = 1,
        .gil = PyInterpreterConfig_OWN_GIL,
    };
    tstate_s2 = NULL;
    PyStatus status_s2 = Py_NewInterpreterFromConfig(&tstate_s2, &config_s2);
    std::cerr << "tstate_s2: " << tstate_s2 << std::endl;
    std::string sysPathCmd2 = "import sys\nsys.path.append('" + cwd + "')";
    PyRun_SimpleString(sysPathCmd2.c_str());
    PyObject* bytecode2 = Py_CompileString(module_code2, "test_module2", Py_file_input);
    PyObject* pModule2 = PyImport_ExecCodeModule("test_module2", bytecode2);
    PyObject* pFunc2 = PyObject_GetAttrString(pModule2, "test_call");
    PyObject* pData2 = PyUnicode_FromString("");
    PyObject* pArgs2 = PyTuple_Pack(1, pData2);
    PyObject* pResult2 = PyObject_CallObject(pFunc2, pArgs2);
    
    PyEval_RestoreThread(tstate_main);

Here is the example Python code I used:

# import yaml # ERROR
# import urllib.request # ERROR
# import urllib # NO ERROR
# import os # NO ERROR
# import os.path # NO ERROR
# import base64 # NO ERROR

def test_call(data):
    print("Called")
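
To see what separates the failing imports from the working ones, it may help to list which compiled extension modules each import pulls in. The following is only a diagnostic sketch: the helper name `extension_modules_loaded_by` is hypothetical, and nothing in this post establishes that extension modules are involved in the crash. Each probe runs in a fresh Python process so that modules already loaded in the caller do not hide anything.

```python
import json
import subprocess
import sys

# Code executed in a fresh interpreter: import the module named in argv[1]
# and report which newly loaded modules come from compiled extension files.
_PROBE = """
import importlib, importlib.machinery, json, sys
before = set(sys.modules)
importlib.import_module(sys.argv[1])
suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
new = sorted(
    name for name in set(sys.modules) - before
    if str(getattr(sys.modules[name], "__file__", None) or "").endswith(suffixes)
)
print(json.dumps(new))
"""


def extension_modules_loaded_by(module_name):
    """Return the names of extension modules first loaded by importing module_name."""
    completed = subprocess.run(
        [sys.executable, "-c", _PROBE, module_name],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    # The module names below are the ones mentioned in the question.
    for name in ("urllib.request", "urllib", "base64", "os"):
        print(name, extension_modules_loaded_by(name))
```

The output depends on how the Python build was compiled (statically linked builds may report nothing), so the lists are a starting point for comparison rather than a diagnosis.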

Upvotes: 3

Views: 184

Answers (1)

Michele Agostini

Reputation: 31

Solved: PyGILState_Ensure() is needed even in a single-threaded program that uses PyInterpreterConfig_OWN_GIL. I did not hold the interpreter's GIL when calling the import, and this led to the memory corruption error.

Upvotes: 0
