Reputation: 842
I'm trying to use the multiprocessing.Pool
object to run some database queries in parallel. I'm using MySQLdb.
I have some module-level functions where I define queries to run, like this:
def check_foo(cursor, table):
query = "(some query)"
cursor.execute(query)
results = cursor.fetchall()
return len(results) == 0
These functions are collected when the program is run, like this:
if __name__ == '__main__':
check_functions = [v for k, v in globals().items()
if k.startswith('check_') and callable(v)]
I also have a module-level function that runs a particular check function on a list of tables:
def run_check_on_all((tables, cursor, f)):
return [f(cursor, table) for table in tables]
I want to have one worker process for each check function that will call run_check_on_all
for that function. Here's my attempt to do that:
if __name__ == '__main__':
...
pool = multiprocessing.Pool(len(check_functions))
cursors = [conn.cursor() for i in range(len(check_functions))]
print "Running {0} check(s)...".format(len(check_functions))
table_lists = [table_list] * len(check_functions)
all_results = pool.map(run_check_on_all, zip(table_lists, cursors, check_functions))
When I attempt to run this, I get the following error:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/Python2.6/lib/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File "/usr/local/Python2.6/lib/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/Python2.6/lib/python2.6/multiprocessing/pool.py", line 225, in _handle_tasks
put(task)
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed
As you can (hopefully) see, nothing involved in the call to pool.map
is an instance method. run_check_on_all
and each of the check_functions
are module-level functions. table_lists
is a list of lists of strings. cursors
is a list of MySQLdb cursor objects.
I thought maybe it had to do with calling the cursor objects' instance methods in the check functions, but I replaced them with dummy functions like this
def check_foo(cursor, table):
print "hello"
and still no luck.
Where is the instance method that the error is referring to?
Upvotes: 1
Views: 733
Reputation: 25197
The problem is that you attempt to pass database cursor objects between processes. Each process must create a connection to the database, and create a cursor on that connection.
Upvotes: 1