jobo3208

Reputation: 842

Can't pickle instance method using Pool.map(), but I have no instance method

I'm trying to use the multiprocessing.Pool object to run some database queries in parallel. I'm using MySQLdb.

I have some module-level functions where I define queries to run, like this:

def check_foo(cursor, table):
    query = "(some query)"
    cursor.execute(query)
    results = cursor.fetchall()
    return len(results) == 0

These functions are collected when the program is run, like this:

if __name__ == '__main__':
    check_functions = [v for k, v in globals().items()
                             if k.startswith('check_') and callable(v)]

I also have a module-level function that runs a particular check function on a list of tables:

def run_check_on_all((tables, cursor, f)):
    return [f(cursor, table) for table in tables]

I want to have one worker process for each check function that will call run_check_on_all for that function. Here's my attempt to do that:

if __name__ == '__main__':
    ...

    pool = multiprocessing.Pool(len(check_functions))
    cursors = [conn.cursor() for i in range(len(check_functions))]

    print "Running {0} check(s)...".format(len(check_functions))
    table_lists = [table_list] * len(check_functions)
    all_results = pool.map(run_check_on_all, zip(table_lists, cursors, check_functions))

When I attempt to run this, I get the following error:

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/Python2.6/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/local/Python2.6/lib/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/Python2.6/lib/python2.6/multiprocessing/pool.py", line 225, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed

As you can (hopefully) see, nothing involved in the call to pool.map is an instance method. run_check_on_all and each of the check_functions are module-level functions. table_lists is a list of lists of strings. cursors is a list of MySQLdb cursor objects.
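One way to test each of those claims is to pickle the arguments one at a time, since Pool.map() has to pickle every task before handing it to a worker. The sketch below is only an illustration: `find_unpicklable` is a hypothetical helper, and a sqlite3 cursor stands in for the MySQLdb cursor so that it runs without a database server (written in Python 3 syntax).

```python
import pickle
import sqlite3


def check_foo(cursor, table):
    # Dummy module-level check function, like the ones in the question.
    return True


def find_unpicklable(named_objects):
    """Return the names of the objects that pickle.dumps() rejects."""
    failed = []
    for name, obj in named_objects:
        try:
            pickle.dumps(obj)
        except Exception:
            # pickle can raise PicklingError, TypeError or AttributeError,
            # depending on the object, so catch broadly here.
            failed.append(name)
    return failed


# Assumption: sqlite3 is a stand-in for MySQLdb, used only for this demo.
conn = sqlite3.connect(":memory:")
candidates = [
    ("table_list", ["users", "orders"]),
    ("check_foo", check_foo),
    ("cursor", conn.cursor()),
]
unpicklable = find_unpicklable(candidates)
print(unpicklable)
```

Whatever name ends up in the returned list is the argument that cannot cross the process boundary.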

I thought it might have to do with calling the cursor objects' instance methods in the check functions, so I replaced the check functions with dummy functions like this:

def check_foo(cursor, table):
    print "hello"

and still no luck.

Where is the instance method that the error is referring to?

Upvotes: 1

Views: 733

Answers (1)

Janne Karila

Reputation: 25197

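The problem is that you are trying to pass database cursor objects between processes. Pool.map() pickles every task argument before sending it to a worker, and a MySQLdb cursor cannot be pickled; the instance method in the error most likely belongs to the cursor's internal state rather than to any of your own functions. Each worker process must instead open its own connection to the database and create a cursor on that connection.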
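A minimal sketch of that approach, under some loud assumptions: sqlite3 and a temporary file stand in for MySQLdb and the real database so the example runs anywhere, `make_demo_db` and `check_is_empty` are hypothetical names, and the code uses Python 3 syntax (no tuple parameters in the signature). With MySQLdb, the worker would open its connection with your own connection parameters at the marked line.

```python
import multiprocessing
import os
import sqlite3
import tempfile


def check_is_empty(cursor, table):
    # Hypothetical check: passes when the table has no rows.
    # Table names are trusted here; do not build SQL from user input like this.
    cursor.execute("SELECT COUNT(*) FROM " + table)
    return cursor.fetchone()[0] == 0


def run_check_on_all(args):
    # Only picklable values cross the process boundary:
    # a path, a list of strings and a module-level function.
    db_path, tables, f = args
    conn = sqlite3.connect(db_path)  # with MySQLdb: open the connection here
    try:
        cursor = conn.cursor()
        return [f(cursor, table) for table in tables]
    finally:
        conn.close()


def make_demo_db():
    # Build a throwaway database: one empty table, one with a single row.
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE users (id INTEGER)")
    conn.execute("CREATE TABLE orders (id INTEGER)")
    conn.execute("INSERT INTO orders VALUES (1)")
    conn.commit()
    conn.close()
    return path


def main():
    db_path = make_demo_db()
    tables = ["users", "orders"]
    check_functions = [check_is_empty]
    tasks = [(db_path, tables, f) for f in check_functions]
    pool = multiprocessing.Pool(len(check_functions))
    try:
        return pool.map(run_check_on_all, tasks)
    finally:
        pool.close()
        pool.join()
        os.remove(db_path)


if __name__ == "__main__":
    print(main())
```

Opening the connection inside the task keeps the sketch short. If connection setup is expensive, the `initializer` argument of multiprocessing.Pool can be used to open one connection per worker process instead of one per task.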

Upvotes: 1
