Reputation: 1301
I have written a C++ library that runs machine learning inference. The inference functions generally use 8 threads each to optimize for low latency (though a call can still take upwards of 100 ms to complete). I have used synchronization primitives such as std::mutex to ensure the library is thread-safe.
I have also written Python bindings for this library using pybind11.
The current problem is that on a 16-core/16-thread CPU, it is optimal to run two inference calls in parallel, thereby utilizing all 16 available hardware threads. This is fine in C++, but in Python (with the threading module), the GIL means that one inference call ends up holding the lock and the other cannot run in parallel. I know I could solve this with multiprocessing, but I'd like to use threading instead because of the design of my library.
Can I therefore release the GIL when making the call to the C++ API function (in the Python binding layer), since the C++ library already implements the necessary thread safety? Are there any other considerations I need to take into account when releasing the GIL?
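For reference, the binding layer currently looks roughly like the sketch below (the module and function names are illustrative, not my real API); the point is that the GIL stays held for the entire C++ call:

```cpp
// Illustrative sketch of the current binding layer (names are made up):
// the bound function holds the GIL for the full duration of the C++ call.
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <vector>

namespace py = pybind11;

// Thread-safe C++ entry point provided by the library (assumed signature).
std::vector<float> run_inference(const std::vector<float>& input);

PYBIND11_MODULE(my_inference, m) {
    // A second Python thread calling this cannot overlap with the first,
    // because the calling thread keeps the GIL for the whole ~100 ms call.
    m.def("run_inference", &run_inference);
}
```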
Upvotes: 1
Views: 516
Reputation: 40013
The GIL protects every Python object. If you aren't accessing any Python objects, or you only perform read-only access to an immutable object for which you hold an owning reference, then it is fine to release the GIL. Do be sure to keep an owning reference to anything you'll use after reacquiring the lock, lest it have been destroyed while the GIL was released.
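With pybind11 specifically, a common way to do this is to release the GIL around the C++ call, either with `py::call_guard<py::gil_scoped_release>()` or with a `py::gil_scoped_release` scope inside the binding. A minimal sketch, assuming a thread-safe C++ entry point named `run_inference` (the names here are placeholders, not your actual API):

```cpp
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>
#include <vector>

namespace py = pybind11;

// Assumed thread-safe C++ entry point.
std::vector<float> run_inference(const std::vector<float>& input);

PYBIND11_MODULE(my_inference, m) {
    // Option 1: release the GIL for the whole call; it is reacquired
    // automatically when the call returns (or if the C++ code throws).
    m.def("run_inference", &run_inference,
          py::call_guard<py::gil_scoped_release>());

    // Option 2: release it manually around just the GIL-free portion.
    m.def("run_inference_manual", [](const std::vector<float>& input) {
        std::vector<float> result;
        {
            py::gil_scoped_release release;  // do not touch Python objects here
            result = run_inference(input);
        }
        return result;  // converted back to a Python object with the GIL held
    });
}
```

The key rule is the one above: between releasing and reacquiring, the C++ code must not create, destroy, or otherwise touch Python objects (including pybind11 wrapper types), and anything Python-side you rely on afterwards needs an owning reference so it cannot be destroyed in the meantime.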
Upvotes: 1