Reputation: 163
I am trying to use pickle to transfer Python objects over the wire between two servers. I created a simple class that subclasses dict, and I am trying to use pickle for the marshalling:
def value_is_not_none(value):
    return value is not None


class CustomDict(dict):
    def __init__(self, cond=lambda x: x is not None):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if self.cond(value):
            dict.__setitem__(self, key, value)
I first tried to use pickle for the marshalling, but when I un-marshalled I received an error related to the lambda expression.
Then I tried to do the marshalling with dill, but it seemed that __init__ was not called.
Then I tried again with pickle, this time passing the value_is_not_none() function as the cond parameter. Again, __init__() does not seem to be invoked, and the un-marshalling failed in __setitem__() (cond is None).
Why is that? What am I missing here?
If I try to run the following code:
import pickle

obj = CustomDict(cond=value_is_not_none)
obj['hello'] = ['world']
payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
obj2 = pickle.loads(payload)
it fails with
AttributeError: 'CustomDict' object has no attribute 'cond'
This is a different question from "Python, cPickle, pickling lambda functions": I tried using dill with a lambda and it failed to work, and I also tried passing a named function and that failed too.
Upvotes: 3
Views: 888
Reputation: 1124170
pickle is loading your dictionary data before it has restored the attributes on your instance. As such, the self.cond attribute is not yet set when __setitem__ is called for the dictionary key-value pairs.
Note that pickle will not call __init__ by default; instead it'll create an entirely blank instance and restore the __dict__ attribute namespace on that directly.
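The ordering described above can be observed with a small tracing class. This is only a sketch: Traced, its tag attribute, and the class-level log and init_calls counters are hypothetical names made up for the demonstration, and it assumes CPython 3 with the default pickling of dict subclasses.

```python
import pickle


class Traced(dict):
    """Hypothetical dict subclass that records what pickle does to it."""

    init_calls = 0  # how many times __init__ has run
    log = []        # (key, was 'tag' set yet?) for every __setitem__ call

    def __init__(self):
        super().__init__()
        type(self).init_calls += 1
        self.tag = "set in __init__"

    def __setitem__(self, key, value):
        # Record whether the instance attribute already exists at this point.
        Traced.log.append((key, hasattr(self, "tag")))
        super().__setitem__(key, value)


original = Traced()
original["a"] = 1

Traced.log.clear()  # only keep what happens during unpickling
restored = pickle.loads(pickle.dumps(original))

print("__init__ calls:", Traced.init_calls)
print("__setitem__ calls during load:", Traced.log)
print("tag after load:", restored.tag)
```

If the explanation holds, __init__ runs only once (for original), the logged __setitem__ call sees no tag attribute yet, and tag is nevertheless present once loading has finished.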
You have two options:
First option: default to cond=None and ignore the condition if it is still set to None:
class CustomDict(dict):
    def __init__(self, cond=None):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if getattr(self, 'cond', None) is None or self.cond(value):
            dict.__setitem__(self, key, value)
The getattr() there is needed because a blank instance has no cond attribute at all (it is not set to None; the attribute is entirely missing). You could add cond = None to the class:

class CustomDict(dict):
    cond = None

and then just test for if self.cond is None or self.cond(value):.
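Put together, the class-attribute variant of this first option might look like the sketch below. It reuses value_is_not_none from the question; the 'skipped' and 'later' keys are made up for illustration. Keep in mind that while the items are being reloaded the condition is not applied at all, which is harmless here because only values that already passed the filter were stored.

```python
import pickle


def value_is_not_none(value):
    return value is not None


class CustomDict(dict):
    # Class-level fallback: a blank instance created by pickle sees this
    # value until its own __dict__ has been restored.
    cond = None

    def __init__(self, cond=None):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if self.cond is None or self.cond(value):
            super().__setitem__(key, value)


obj = CustomDict(cond=value_is_not_none)
obj['hello'] = ['world']
obj['skipped'] = None  # rejected by the condition, never stored

payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
obj2 = pickle.loads(payload)

obj2['later'] = None  # the restored condition is active again
print(obj2, obj2.cond)
```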
Second option: define a custom __reduce__ method to control how the initial object is created when restored:
def _default_cond(v):
    return v is not None


class CustomDict(dict):
    def __init__(self, cond=_default_cond):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if self.cond(value):
            dict.__setitem__(self, key, value)

    def __reduce__(self):
        return (CustomDict, (self.cond,), None, None, iter(self.items()))
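__reduce__ is expected to return a tuple with:
- A callable that creates the initial object; here that is the CustomDict class itself.
- A tuple of arguments for that callable. By passing in (self.cond,) we ensure that the new instance is created with cond passed in as an argument, and now CustomDict.__init__() will be called.
- The next two positions are for state to hand to a __setstate__ method (ignored here) and for an iterator of items for list-like types, so we set these to None.
- An iterator of key-value pairs. These are stored with obj[key] = value after the instance has been created, so __setitem__ now finds cond already set.

Note that I replaced the default value for cond with a function here too, so you don't have to rely on dill for the pickling.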
Upvotes: 2