thyme

Reputation: 480

multiprocessing.Process subclass using shared queue

I'm using the multiprocessing module on Python 2.7 on Windows and I have multiple processes putting data on and taking data off of shared queues. I'm subclassing multiprocessing.Process to do this, and passing in queue proxies made by a multiprocessing.Manager() as arguments to __init__. In other answers on SO I've seen people pass this queue proxy into map_async as an argument, but when I try to give it as an argument to the __init__ method I get the error:

TypeError: Pickling an AuthenticationString object is disallowed for security reasons

So I understand that on Windows the things you pass when instantiating Process subclasses have to be picklable, and that these shared objects need an authkey (which prevents pickling). But why can you give that queue proxy to map_async and not to a Process subclass? Is there any good way around this besides rewriting my Process subclasses as functions?

Upvotes: 2

Views: 3310

Answers (1)

Silas Ray

Reputation: 26150

map_async isn't equivalent to initializing and running a Process with the given arguments; apply_async is. map_async takes an iterable (in your example, a Manager.Queue), splits it into batches (effectively unpacking and repacking it into a series of tuples), then starts the workers and hands off the batches, not the original iterable. apply_async, or starting the process directly, passes the exact object you provide to the workers as one of their arguments. You also happen to be using a proxy Queue from the Manager, and all proxies produced by Managers carry an AuthenticationString member, which, as your error states, is unpicklable for security reasons, so it can't be propagated down to the workers.
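To see where the error message comes from, here is a minimal sketch that tries to pickle the current process's authkey directly, with no Manager or worker involved. The helper name try_pickle_authkey is made up for illustration, and the sketch assumes that the authkey of the main process is the same kind of AuthenticationString object a proxy carries.

```python
import multiprocessing
import pickle


def try_pickle_authkey():
    """Return the error text raised when pickling the authkey, or None."""
    # The authkey of the current process is an AuthenticationString.
    authkey = multiprocessing.current_process().authkey
    try:
        pickle.dumps(authkey)
    except TypeError as exc:
        # Outside of process spawning, pickling is refused on purpose.
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_pickle_authkey())
```

Anything that holds such an object as an attribute hits the same refusal when it is pickled by ordinary means.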

Is there a reason you are using a Manager to produce your Queue? Unless you are using it over a network or something, you should be just fine using a standard multiprocessing.Queue, which won't have the picklability issue (since that derives from being a Manager proxy).
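As a rough sketch of that suggestion, a Process subclass can take plain multiprocessing.Queue objects in __init__ and keep them as attributes. The class name Doubler, the helper run_demo, and the None sentinel are assumptions made up for this example, not anything from the question's code.

```python
import multiprocessing


class Doubler(multiprocessing.Process):
    """Hypothetical worker: reads ints from inbox, writes doubles to outbox."""

    def __init__(self, inbox, outbox):
        super(Doubler, self).__init__()
        self.inbox = inbox
        self.outbox = outbox

    def run(self):
        # iter(callable, sentinel) calls inbox.get() until it returns None.
        for item in iter(self.inbox.get, None):
            self.outbox.put(item * 2)


def run_demo(values):
    # Plain multiprocessing.Queue objects, not Manager proxies.
    inbox = multiprocessing.Queue()
    outbox = multiprocessing.Queue()
    worker = Doubler(inbox, outbox)
    worker.start()
    for value in values:
        inbox.put(value)
    inbox.put(None)  # sentinel telling the worker to stop
    # Drain the results before joining so the worker can flush its queue.
    results = [outbox.get() for _ in values]
    worker.join()
    return results


if __name__ == "__main__":
    print(run_demo([1, 2, 3]))
```

The __main__ guard matters on Windows, where the child process re-imports the main module when it starts.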

Incidentally, at least from my reading of the code, it looks like using a Manager.Queue, or even a regular old multiprocessing.Queue or Queue as an input to map_async is fairly pointless, as the iterable being mapped is completely consumed in the parent process before any workers are even created, then never looked at again.
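One way to check the "consumed up front" part of that reading is to hand map_async a generator that records what has been pulled from it. The sketch below uses multiprocessing.pool.ThreadPool, a subclass of Pool, so that it runs without spawning processes. The names square, source and demo are made up for illustration, and a generator stands in for whatever iterable is being mapped.

```python
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


def demo():
    consumed = []

    def source():
        # Record each item as the pool pulls it from the generator.
        for i in range(5):
            consumed.append(i)
            yield i

    pool = ThreadPool(2)
    try:
        async_result = pool.map_async(square, source())
        # Snapshot taken right after map_async returns, before any
        # result has been fetched.
        consumed_at_return = list(consumed)
        results = async_result.get(timeout=10)
    finally:
        pool.close()
        pool.join()
    return consumed_at_return, results


if __name__ == "__main__":
    print(demo())
```

If the snapshot already holds every item, the iterable was drained by the caller during the map_async call itself and was not streamed to the workers lazily.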

Upvotes: 3
