Reputation: 1637
Actually I am not sure whether the title describes the problem appropriately. Let me show the code.
import os
from multiprocessing import JoinableQueue

# A dict-like class, but is able to be accessed by attributes.
# example: d = AttrDict({'a': 1, 'b': 2})
# d.a is equivalent to d['a']
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self

queue = JoinableQueue()
pid = os.fork()
if pid == 0:
    d = AttrDict({'a': 1, 'b': 2})
    queue.put(d)
    queue.join()
    os._exit(0)
else:
    d = queue.get()
    queue.task_done()
    #d = AttrDict(d.items()) #(1)
    d.a = 3 #(2)
    #d['a'] = 3 #(3)
    print d
The above code prints {'a': 1, 'b': 2}, which means (2) has no effect on the dict contents.
If I change (2) to (3), or enable (1), the output is {'a': 3, 'b': 2}, as expected.
It seems something happens to d when it is passed through the queue.
Tested with Python 2.7.
Solution:
As pointed out by @kindall and @Blckknght, the reason is that d is pickled as a dict, and when it is unpickled by queue.get(), the self.__dict__ = self magic is not restored. The difference can be seen by comparing print d.__dict__ with print d.
To restore the magic, I added a __setstate__ method to AttrDict:
class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self

    def __setstate__(self, state):
        self.__dict__ = state
The code now works as expected.
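The fix can also be checked without forking, by doing a plain pickle round trip, which is what the queue does internally according to the answers. This is only a minimal sketch: the roundtrip helper is a name I made up for illustration, and the code is written so it should run on both Python 2.7 and Python 3.

```python
import pickle


class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self

    def __setstate__(self, state):
        # pickle calls this while rebuilding the object, after the
        # key/value pairs have already been restored.
        self.__dict__ = state


def roundtrip(obj):
    # Hypothetical helper: serialize and deserialize, roughly what
    # passing the object through a multiprocessing queue does.
    return pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))


d = roundtrip(AttrDict({'a': 1, 'b': 2}))
d.a = 3          # attribute assignment, like (2) in the question
print(d)
```

If attribute assignment updates the dict again, the printed value contains 'a': 3.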
Upvotes: 0
Views: 1922
Reputation: 104722
This isn't really a multiprocessing issue: multiprocessing.Queue uses pickle to serialize and unserialize the objects you send through it. The problem lies with pickle not preserving the "magic" behavior you get when you set self.__dict__ = self.
If you check the object you get in the receiving process, you'll find that its __dict__ is just an ordinary dictionary, with the same contents as the object itself. When you set a new attribute on the object, its __dict__ gets updated, but the dictionary the object inherits from (self) does not. Here's what I mean:
>>> import pickle
>>> d = AttrDict({"a":1, "b":2})
>>> d2 = pickle.loads(pickle.dumps(d, -1))
>>> d2
{'a': 1, 'b': 2}
>>> d2.b = 3
>>> d2
{'a': 1, 'b': 2}
>>> d2.__dict__
{'a': 1, 'b': 3}
While you could dive into the nitty-gritty details of how pickle works and get your serialization working again, I think a simpler approach would be to rely on less magical behavior by having your class override the __getattr__, __setattr__ and __delattr__ methods:
class AttrDict(dict):
    __slots__ = () # we don't need a __dict__

    def __getattr__(self, name): # wrapper around dict.__getitem__, with an exception fix
        try:
            return self[name]
        except KeyError:
            # raise the right type of exception
            # ("from None" is Python 3 syntax; drop it on Python 2.7)
            raise AttributeError(name) from None

    def __delattr__(self, name): # wrapper around dict.__delitem__
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None # change exception type here too

    __setattr__ = dict.__setitem__ # no special exception rewriting needed here
Instances of this class will work just like your own, but they can be pickled and unpickled successfully:
>>> d = AttrDict({"a":1, "b":2})
>>> d2 = pickle.loads(pickle.dumps(d, -1)) # serialize and unserialize
>>> d2
{'a': 1, 'b': 2}
>>> d2.b=3
>>> d2
{'a': 1, 'b': 3}
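The exception rewriting matters because hasattr and three-argument getattr only treat AttributeError as "missing". Here is a small sketch of that behavior, assuming Python 3 (because of "from None"). The variable names are mine, and the class is repeated only so the snippet runs on its own.

```python
class AttrDict(dict):
    __slots__ = ()  # no per-instance __dict__

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None

    __setattr__ = dict.__setitem__


d = AttrDict({"a": 1})
has_a = hasattr(d, "a")                      # key exists
has_missing = hasattr(d, "missing")          # KeyError became AttributeError
fallback = getattr(d, "missing", "default")  # default is used for a missing key

d.c = 5    # stored as d["c"]
del d.a    # removes d["a"]

# Caveat: real dict attributes win over keys, because __getattr__ is
# only a fallback. A key named "keys" is reachable only as d["keys"].
d["keys"] = "shadowed"
keys_is_method = callable(d.keys)
```

One consequence of this design is that keys which collide with dict method names (keys, items, update and so on) are still stored, but cannot be read back through attribute access.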
Upvotes: 1
Reputation: 184200
My guess is that since it's a subclass of dict, your AttrDict is serialized as a dict. In particular, the __dict__ pointing to self is probably not preserved. You can customize the serialization using certain magic methods; see this article.
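As one possible illustration of such a magic method (not necessarily what the linked article describes), __reduce__ can tell pickle to rebuild the object by calling the class again, so that __init__ runs and sets up self.__dict__ = self once more. A minimal sketch, written so it should run on both Python 2.7 and Python 3:

```python
import pickle


class AttrDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttrDict, self).__init__(*args, **kwargs)
        self.__dict__ = self

    def __reduce__(self):
        # Ask pickle to recreate the object as AttrDict(<plain dict copy>),
        # which runs __init__ again on the receiving side.
        return (self.__class__, (dict(self),))


d = AttrDict({'a': 1, 'b': 2})
d2 = pickle.loads(pickle.dumps(d, pickle.HIGHEST_PROTOCOL))
d2.a = 3   # should now update the dict contents as well
print(d2)
```

The trade-off is that the contents are copied into a plain dict at pickling time, which is fine for small objects like the one in the question.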
Upvotes: 1