Tarantula

Reputation: 19902

Python multiprocessing Queue put() behavior

I'm doing something very simple using multiprocessing:

data = {'a': 1}
queue.put(data, True)
data.clear()

When I use the queue on another process (using get() method), I get an empty dictionary. If I remove data.clear() I get the keys as expected. Is there any way to wait for the put() to have finished the serialization ?
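For completeness, here is a runnable version of what I'm doing (the worker function is just scaffolding to show the effect; whether it prints the key or an empty dict depends on timing):

from multiprocessing import Process, Queue

def worker(q):
    # May print {} if the parent clears the dict before it has been pickled
    print(q.get())

if __name__ == '__main__':
    queue = Queue()
    p = Process(target=worker, args=(queue,))
    p.start()
    data = {'a': 1}
    queue.put(data, True)
    data.clear()
    p.join()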

Upvotes: 10

Views: 10318

Answers (2)

fnl

Reputation: 5301

Actually, this is intended as a feature, not a problem. put() returns immediately and a background feeder thread does the pickling, so your process can keep working while serialization happens; this avoids what is known as "queue contention".

As I see it, you have two options:

  1. Are you absolutely sure you need mutable dictionaries in the first place? Instead of making defensive copies of your data, which you (rightly, it seems) dislike, why not simply create a new dictionary rather than calling dict.clear(), and let the garbage collector worry about the old ones?

  2. Pickle the data yourself; that is: a_queue.put(pickle.dumps(data)) and pickle.loads(a_queue.get()) (see the sketch just below this list). Now, if you call data.clear() right after a put, the data has already been serialized "by you".
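A rough sketch of option 2, assuming a multiprocessing.Queue; the put_snapshot/get_snapshot helper names are just mine for illustration:

import pickle
from multiprocessing import Queue

def put_snapshot(q, obj):
    # Serialize here, in the caller, so later mutations of obj cannot change what was queued
    q.put(pickle.dumps(obj), True)

def get_snapshot(q):
    # Reverse step on the consumer side
    return pickle.loads(q.get())

queue = Queue()
data = {'a': 1}
put_snapshot(queue, data)
data.clear()  # safe: data was already serialized by put_snapshot()

For option 1 the change is even smaller: replace data.clear() with data = {} and let the old dictionary be collected.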

From a parallel programming point of view, the first approach (treating your data as if it were immutable) is the more viable and cleaner one in the long term, but I am not sure whether, or why, you must clear your dictionaries.

Upvotes: 13

Tom Hunt

Reputation: 958

The best way is probably to make a copy of data before sending it. Try:

data = {'a': 1}
dc = data.copy()   # snapshot taken before anything is handed to the queue
queue.put(dc)
data.clear()       # clearing data no longer affects what was queued

Basically, you can't count on the send finishing before the dictionary is cleared, so you shouldn't try. dc will be garbage-collected once it goes out of scope or is rebound the next time this code runs.

Upvotes: 4
