Reputation: 2675
Let's say I have a class:
class Character():
    def __init__(self):
        self.race = "Ork"
I create an instance and pickle it.
import pickle

c = Character()
with open(r'C:\tmp\state.bin', 'w+b') as f:
    pickle.dump(c, f)
When I try to unpickle it, everything works fine. But what if I want to add another attribute to Character? I go and do this:
class Character():
    def __init__(self):
        self.race = "Ork"
        self.health = 100
Let's say I want to unpickle the old version, which has no health attribute. If I just unpickle the data from the file, the object will not have the health attribute. To do this properly, following the book "Effective Python", I need to introduce arguments with default values and bring copyreg into play.
So, I do this:
class Character:
    def __init__(self, race = "Ork", health = 100):
        self.race = race
        self.health = health

import copyreg

def pickle_character(state):
    kwargs = state.__dict__
    return unpickle_character, (kwargs, )

def unpickle_character(kwargs):
    return Character(**kwargs)

copyreg.pickle(Character, pickle_character)
Now unpickling should work fine:
with open(r'C:\tmp\state.bin', 'rb') as f:
    c = pickle.load(f)
This code runs without errors; however, I still don't see our new health attribute on the c object.
The question is simple: why does this happen? According to "Effective Python", everything should work fine.
Upvotes: 1
Views: 918
Reputation: 52079
The standard behaviour for unpickling assigns the attributes directly; it does not call __init__. Therefore, your default arguments do not apply.
"When a class instance is unpickled, its __init__() method is usually not invoked." (Python pickle documentation)
Calling __init__ may have side effects, and it may take additional, fewer or other parameters than the instance has attributes. This makes it an unsafe default. In effect, pickle uses object.__new__(cls) to create the instance and then updates its __dict__.
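A minimal sketch to confirm this, assuming a throwaway class (Probe and its init_calls counter are hypothetical names used only for illustration). It counts how often __init__ runs across a pickle round trip:

```python
import pickle


class Probe:
    # Class-level counter: incremented every time __init__ runs.
    init_calls = 0

    def __init__(self):
        Probe.init_calls += 1
        self.race = "Ork"


original = Probe()  # __init__ runs once here
restored = pickle.loads(pickle.dumps(original))

print(Probe.init_calls)  # 1: unpickling did not call __init__ again
print(restored.race)     # Ork: the attribute came from the stored __dict__
```

The counter stays at 1, while the restored object still carries the attributes that were stored in the pickle.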
You must explicitly tell pickle to use __init__ if you want to.
When using copyreg, the function you register (pickle_character) only controls how instances are pickled from then on: the callable it returns (unpickle_character) is stored inside each newly written pickle. Your old pickle was written before that registration, so it contains no reference to unpickle_character, and loading it does not call your constructor. copyreg.pickle also accepts an optional constructor parameter, with a different signature than your unpickle_character, but in current Python 3 it is documented as a legacy argument and does not change how an old pickle is loaded.
def pickle_character(state):
    kwargs = state.__dict__
    return unpickle_character, (kwargs, )
    #      ^ unpickler stored for *newly pickled instance*!

# no constructor stored for *Character class* v
copyreg.pickle(Character, pickle_character)
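To see the difference between an old and a new pickle, here is a rough sketch that pickles the same object before and after the copyreg registration. The "old" object is simulated by bypassing __init__ so that it only has a race attribute; this is an assumption made for the demo, not how your file was produced.

```python
import copyreg
import pickle


class Character:
    def __init__(self, race="Ork", health=100):
        self.race = race
        self.health = health


def unpickle_character(kwargs):
    return Character(**kwargs)


def pickle_character(obj):
    # Copy the dict so the pickle does not share state with the live object.
    return unpickle_character, (dict(obj.__dict__),)


# Simulate an instance of the old class: it only has "race".
old_obj = Character.__new__(Character)
old_obj.race = "Ork"

old_data = pickle.dumps(old_obj)             # written before registration
copyreg.pickle(Character, pickle_character)
new_data = pickle.dumps(old_obj)             # written after registration

# Only the new pickle names the unpickling function.
print(b"unpickle_character" in old_data)     # False
print(b"unpickle_character" in new_data)     # True

old_loaded = pickle.loads(old_data)          # default path, no __init__
new_loaded = pickle.loads(new_data)          # goes through unpickle_character

print(hasattr(old_loaded, "health"))         # False
print(new_loaded.health)                     # 100
```

Note that copyreg.pickle registers the reducer for the whole process, so the registration has to be in place before the data is written for this approach to help.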
It is easier to define __setstate__ on your class. This directly receives the state, even from older pickles.
class Character:
    def __init__(self, race, health):
        self.race = race
        self.health = health

    # load state with defaults for missing attributes
    def __setstate__(self, state):
        self.race = state.get('race', 'Ork')
        self.health = state.get('health', 100)
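As a quick check, the following sketch round-trips an object that lacks health through this __setstate__. The legacy object is built by hand with Character.__new__ to stand in for an old pickle, and the 'Elf' value is just an illustrative assumption.

```python
import pickle


class Character:
    def __init__(self, race, health):
        self.race = race
        self.health = health

    # Load state with defaults for missing attributes.
    def __setstate__(self, state):
        self.race = state.get('race', 'Ork')
        self.health = state.get('health', 100)


# Stand-in for an object pickled by the old class: no "health" attribute.
legacy = Character.__new__(Character)
legacy.__dict__['race'] = 'Elf'
old_data = pickle.dumps(legacy)

restored = pickle.loads(old_data)
print(restored.race, restored.health)  # Elf 100
```

The stored race survives, and the missing health falls back to the default from __setstate__.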
If you know that __init__ is safe and backwards-compatible, you can use it to initialise from the pickled state as well.
class Character:
    # defaults for every initialisation
    def __init__(self, race='Ork', health=100):
        self.race = race
        self.health = health

    def __setstate__(self, state):
        # re-use __init__ for initialisation
        self.__init__(**state)
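A small sketch of what "backwards-compatible" means here, under the assumption that a stored state might contain a key __init__ no longer accepts (the mana attribute is hypothetical). Missing keys are filled from the defaults, but unknown keys make __init__ raise:

```python
import pickle


class Character:
    # Defaults for every initialisation.
    def __init__(self, race='Ork', health=100):
        self.race = race
        self.health = health

    def __setstate__(self, state):
        # Re-use __init__ for initialisation.
        self.__init__(**state)


# State with a missing key: the default for "health" is applied.
legacy = Character.__new__(Character)
legacy.__dict__['race'] = 'Elf'
restored = pickle.loads(pickle.dumps(legacy))
print(restored.race, restored.health)  # Elf 100

# State with a key __init__ does not know about: loading fails.
stale = Character.__new__(Character)
stale.__dict__.update(race='Elf', mana=5)
try:
    pickle.loads(pickle.dumps(stale))
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```

If attributes may be removed or renamed over time, the explicit state.get(...) variant is the more forgiving choice.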
Upvotes: 5