Reputation: 19902
I'm doing something very simple using multiprocessing:
import multiprocessing

queue = multiprocessing.Queue()
data = {'a': 1}
queue.put(data, True)
data.clear()
When I read from the queue in another process (using the get()
method), I get an empty dictionary. If I remove data.clear(),
I get the keys as expected. Is there any way to wait for put()
to have finished the serialization?
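Here is a minimal sketch of what I'm seeing, with a child process reading from the queue (the reader function is just for illustration):

import multiprocessing

def reader(queue):
    # May print {} instead of {'a': 1}: the parent clears the dict
    # before the queue's background thread has finished pickling it.
    print(queue.get())

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    child = multiprocessing.Process(target=reader, args=(queue,))
    child.start()

    data = {'a': 1}
    queue.put(data, True)
    data.clear()

    child.join()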
Upvotes: 10
Views: 10318
Reputation: 5301
Actually, this is considered a feature, not a problem: put() returns immediately, so your process can continue while serialization happens in a background feeder thread, which avoids what is known as "queue contention".
You have two options:

1. Are you absolutely sure you need a mutable dictionary in the first place? Instead of making defensive copies of your data, which you understandably seem to dislike, why not just create a new dictionary instead of calling dict.clear(), and let the garbage collector worry about the old ones? (A sketch of this follows at the end of this answer.)

2. Pickle the data yourself, that is: a_queue.put(pickle.dumps(data)) and pickle.loads(a_queue.get()). Now, if you call data.clear() just after a put, the data has already been serialized "by you".
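A minimal sketch of the second option; safe_put and safe_get are hypothetical helper names of mine, not part of multiprocessing:

import multiprocessing
import pickle

def safe_put(a_queue, obj):
    # Serialize eagerly in the caller: later mutations of obj
    # cannot affect the bytes already placed on the queue.
    a_queue.put(pickle.dumps(obj))

def safe_get(a_queue):
    return pickle.loads(a_queue.get())

if __name__ == '__main__':
    a_queue = multiprocessing.Queue()
    data = {'a': 1}
    safe_put(a_queue, data)
    data.clear()              # harmless: the dict was already pickled
    print(safe_get(a_queue))  # -> {'a': 1}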
From a parallel-programming point of view, the first approach (treat your data as if it were immutable) is the more viable and cleaner one in the long run, but I am not sure if or why you must clear your dictionaries.
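For example, a sketch of that rebinding approach, assuming a plain multiprocessing.Queue like the one in your question:

import multiprocessing

queue = multiprocessing.Queue()
data = {'a': 1}
queue.put(data, True)
data = {}           # rebind instead of data.clear(): the queued dict is
                    # untouched and collected once the queue is done with it
print(queue.get())  # -> {'a': 1}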
Upvotes: 13
Reputation: 958
The best way is probably to make a copy of data
before sending it. Try:
import multiprocessing

queue = multiprocessing.Queue()
data = {'a': 1}
dc = data.copy()  # snapshot taken before the put
queue.put(dc)     # the queue serializes the copy, not data
data.clear()      # safe: clearing data does not affect dc
Basically, you can't count on the send finishing before the dictionary is cleared, so you shouldn't try. dc
will be garbage-collected once the queue has serialized it and the name goes out of scope or is rebound the next time that code runs.
Upvotes: 4