Reputation: 203
I was trying to call a function from a shared library (DLL) inside a multiprocessing Pool in Python and ran into the following problem. I've created a simple library to illustrate it. Here's its source code, which contains a single function that sums two doubles:
extern "C" {
    double sum(double x, double y);
}

double sum(double x, double y) {
    return x + y;
}
I compile it on a Linux system using
g++ -fPIC -c dll_main.cpp
g++ dll_main.o -shared -o sum_dll.so
I use this dll-library in the following Python script:
#!/usr/bin/env python3.8
from ctypes import *
import multiprocessing
from multiprocessing import Pool, freeze_support

def run_dll(dll_obj, x, y):
    x_c = c_double(x)
    y_c = c_double(y)
    z = dll_obj.sum(x_c, y_c)
    return z

def main():
    pool = Pool(processes=2)
    dll_obj = cdll.LoadLibrary('./sum_dll.so')
    dll_obj.sum.restype = c_double

    z = pool.map(run_dll, [(dll_obj, 2, 3), (dll_obj, 3, 4)])
    pool.close()
    pool.join()


if __name__ == "__main__":
    freeze_support()
    main()
I get the following error message:
Traceback (most recent call last):
  File "./run_dll.py", line 24, in <module>
    main()
  File "./run_dll.py", line 17, in main
    z = pool.map(run_dll, [(dll_obj, 2, 3), (dll_obj, 3, 4)])
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 768, in get
    raise self._value
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 537, in _handle_tasks
    put(task)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/lib/python3.8/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'CDLL.__init__.<locals>._FuncPtr'
What am I doing wrong? How do I properly use a shared library from several processes in Python?
Upvotes: 1
Views: 784
Reputation: 17636
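You can't pickle ctypes library objects or function pointers. They wrap handles and addresses that are only valid inside the process that loaded the library, and loading it requires a call to the operating system's dynamic loader to link it into that process.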
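To see the failure in isolation, here is a minimal sketch that tries to pickle a loaded library object directly and then a module-level wrapper. It assumes a Unix-like system and uses the C math library (found through `ctypes.util.find_library`) as a stand-in for `sum_dll.so`; `can_pickle` and `c_pow` are hypothetical helper names, not part of the question's code.

```python
import ctypes
import ctypes.util
import pickle

# Assumption: libm stands in for sum_dll.so so the sketch runs without compiling anything.
# If find_library returns None, CDLL(None) falls back to the symbols already
# loaded into the current process (Unix-like systems only).
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.pow.restype = ctypes.c_double
libm.pow.argtypes = [ctypes.c_double, ctypes.c_double]


def c_pow(x, y):
    """Module-level wrapper: pickled by name, so no pointer leaves this process."""
    return libm.pow(x, y)


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if it raises."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


if __name__ == "__main__":
    print("library object picklable:", can_pickle(libm))
    print("wrapper function picklable:", can_pickle(c_pow))
    print("plain arguments picklable:", can_pickle((2.0, 3.0)))
```

The wrapper pickles because only its module and name are serialized; the receiving process looks the function up again and uses its own copy of the library.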
Since you cannot pass the library's functions as arguments, you need to wrap them in a Python function, defined in the global (module) scope the same way you'd define any normal Python function. Module-level functions are pickled by name, so they can be sent to the worker processes:
from ctypes import *
import multiprocessing
from multiprocessing import Pool, freeze_support

dll_obj = cdll.LoadLibrary('./sum_dll.so')
dll_obj.sum.restype = c_double

def pyrun_dll(x, y):  # looks for the dll in global scope
    x_c = c_double(x)
    y_c = c_double(y)
    z = dll_obj.sum(x_c, y_c)
    return z

def dll_sum(x_c, y_c):  # pickleable wrapper
    return dll_obj.sum(x_c, y_c)

def run_dll(py_sum, x, y):  # arguments must be pickleable
    x_c = c_double(x)
    y_c = c_double(y)
    z = py_sum(x_c, y_c)
    return z
Then you call these Python functions in your main:
def main():
    pool = Pool(processes=2)
    z = pool.starmap(pyrun_dll, [(2, 3), (3, 4)])
    z2 = pool.starmap(run_dll, [(dll_sum, 2, 3), (dll_sum, 3, 4)])
    pool.close()
    pool.join()

if __name__ == "__main__":
    freeze_support()
    main()
How the above code executes depends on your operating system, but it works on all platforms (assuming you change the library name for each platform): the worker processes either inherit the loaded library from the global scope when they are forked, or load it themselves when they import your file.
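The platform difference comes from the multiprocessing start method: with `fork`, workers inherit the parent's memory, including the library that is already loaded; with `spawn` (the default on Windows and macOS), each worker starts a fresh interpreter and re-runs the module-level load on import. A hedged sketch that requests `spawn` explicitly, to check that a script does not silently depend on `fork`. It assumes a Unix-like system and again uses libm as a stand-in for `sum_dll.so`; `py_hypot` is a hypothetical wrapper name.

```python
import ctypes
import ctypes.util
import multiprocessing

# Module-level load: runs in the parent, and again in every spawned worker
# when it imports this file.
# Assumption: libm stands in for sum_dll.so (Unix-like systems).
lib = ctypes.CDLL(ctypes.util.find_library("m"))
lib.hypot.restype = ctypes.c_double
lib.hypot.argtypes = [ctypes.c_double, ctypes.c_double]


def py_hypot(x, y):
    """Picklable wrapper; each worker resolves `lib` from its own globals."""
    return lib.hypot(x, y)


def main():
    # Ask for "spawn" instead of relying on the platform default.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        print(pool.starmap(py_hypot, [(3.0, 4.0), (6.0, 8.0)]))


if __name__ == "__main__":
    main()
```

If this runs under `spawn`, the same script should also run under `fork`, since the wrapper never carries the library across the process boundary.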
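Your original function will also work if you pass dll_sum to it instead of the raw ctypes object, as long as it calls that argument directly (as run_dll above does) rather than calling dll_obj.sum, and you use pool.starmap instead of pool.map so that each tuple is unpacked into separate arguments.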
If you need the library to be loaded dynamically, have each process call cdll.LoadLibrary itself to link the library (usually through the Pool initializer), rather than receiving it through function arguments.
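A minimal sketch of the initializer approach, assuming a Unix-like system and using libm's `fmod` as a stand-in for the library and function you would actually load. `init_worker`, `worker_fmod` and `_lib` are hypothetical names; only the library path (a string) and plain floats are ever pickled.

```python
import ctypes
import ctypes.util
from multiprocessing import Pool

_lib = None  # per-process library handle, filled in by init_worker


def init_worker(lib_path):
    """Runs once in each worker: load the library there instead of shipping it."""
    global _lib
    _lib = ctypes.CDLL(lib_path)
    _lib.fmod.restype = ctypes.c_double
    _lib.fmod.argtypes = [ctypes.c_double, ctypes.c_double]


def worker_fmod(x, y):
    """Task function: uses the handle that this worker loaded for itself."""
    return _lib.fmod(x, y)


def main():
    # Assumption: libm stands in for a path chosen at run time, e.g. './sum_dll.so'.
    lib_path = ctypes.util.find_library("m")
    with Pool(processes=2, initializer=init_worker, initargs=(lib_path,)) as pool:
        print(pool.starmap(worker_fmod, [(7.0, 3.0), (9.0, 4.0)]))


if __name__ == "__main__":
    main()
```

Because the path is passed through `initargs`, it can be decided at run time in `main` without any module-level load.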
Upvotes: 2