Paul Osborne

Reputation: 5124

Pickled Object Versioning

I am working on a project where we have a large number of objects being serialized and stored to disk using pickle/cPickle.

As the project matures (after release to customers in the field), it is likely that future features and fixes will require us to change the signature of some of our persisted objects. This could mean adding fields, removing fields, or even just changing the invariants on a piece of data.

Is there a standard way to mark an object that will be pickled as having a certain version (like serialVersionUID in Java)? Basically, if I am restoring an instance of Foo version 234 but the current code is version 236, I want to receive some notification on unpickle. Or should I just go ahead and roll my own solution (which could be a PITA)?

Thanks

Upvotes: 13

Views: 3888

Answers (2)

Jasha

Reputation: 7659

Consider the following class mixin suggested by Tomasz Früboes here.

# versionable.py
class Versionable(object):
    def __getstate__(self):
        # Called by pickle when dumping: store the class version with the state.
        if not hasattr(self, "_class_version"):
            raise Exception("Your class must define _class_version class variable")
        return dict(_class_version=self._class_version, **self.__dict__)

    def __setstate__(self, dict_):
        # Called by pickle when loading: compare the stored version with the code.
        version_present_in_pickle = dict_.pop("_class_version")
        if version_present_in_pickle != self._class_version:
            raise Exception("Class versions differ: in pickle file: {}, "
                            "in current class definition: {}"
                            .format(version_present_in_pickle,
                                    self._class_version))
        self.__dict__ = dict_

The __getstate__ method is called by pickle upon pickling, and __setstate__ is called by pickle upon unpickling. This mix-in can be used as a base class for any class whose version you want to keep track of. It is used as follows:

# bla.py
from versionable import Versionable
import pickle

class TestVersioning(Versionable):
    _class_version = 1

t1 = TestVersioning()

t_pickle_str = pickle.dumps(t1)

class TestVersioning(Versionable):
    _class_version = 2

t2 = pickle.loads(t_pickle_str)  # Raises an exception about the wrong class version
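
If raising on every mismatch is too strict, one possible variation (not part of the mixin above) is to upgrade old state while unpickling. What follows is only a minimal sketch, assuming Python 3. The names MigratingVersionable, _migrations, Measurement and _add_units are hypothetical and made up for illustration.

```python
import pickle


class MigratingVersionable:
    """Hypothetical variant of the mixin: upgrades old state instead of raising."""

    _class_version = 1
    # Hypothetical hook: maps a version number to a function that takes the
    # state dict at that version and returns it upgraded by one version.
    _migrations = {}

    def __getstate__(self):
        state = dict(self.__dict__)
        state["_class_version"] = self._class_version
        return state

    def __setstate__(self, state):
        state = dict(state)
        # Assumption: a pickle without a version stamp counts as version 0.
        version = state.pop("_class_version", 0)
        if version > self._class_version:
            raise ValueError("pickle is newer than the current class definition")
        while version < self._class_version:
            if version not in self._migrations:
                raise ValueError(f"no migration from version {version}")
            state = self._migrations[version](state)
            version += 1
        self.__dict__.update(state)


def _add_units(state):
    # Version 1 had no "units" field; assume metres for old data.
    state.setdefault("units", "m")
    return state


class Measurement(MigratingVersionable):
    _class_version = 2
    _migrations = {1: _add_units}

    def __init__(self, value, units="m"):
        self.value = value
        self.units = units


# A current object survives a normal round trip.
current = pickle.loads(pickle.dumps(Measurement(1.5, "cm")))

# Simulate the state an old version-1 pickle would have carried.
old_state = {"_class_version": 1, "value": 3.0}
restored = Measurement.__new__(Measurement)
restored.__setstate__(old_state)
```

The trade-off is that every version bump needs a matching entry in _migrations, but old files stay readable instead of failing outright.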

Upvotes: 7

Alex Martelli

Reputation: 881863

The pickle format has no such proviso. Why don't you just make the "serial version number" part of the object's attributes, to be pickled right along with the rest? Then the "notification" can be trivially had by comparing actual and desired version -- don't see why it should be a PITA.
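
As a rough illustration of that idea, the version can live in an ordinary instance attribute and be checked right after loading. This is only a sketch under assumptions: FOO_VERSION, Foo, payload and load_foo are hypothetical names, and a warning is just one possible form of "notification".

```python
import pickle
import warnings

FOO_VERSION = 236  # hypothetical current version of the Foo class


class Foo:
    def __init__(self, payload):
        # The version is a normal attribute, so pickle stores it with the rest.
        self.version = FOO_VERSION
        self.payload = payload


def load_foo(data):
    """Unpickle a Foo and warn if it was written by a different version."""
    obj = pickle.loads(data)
    stored = getattr(obj, "version", None)  # None: pickled before versioning
    if stored != FOO_VERSION:
        warnings.warn(
            "Foo was pickled as version {}, current code is version {}".format(
                stored, FOO_VERSION
            )
        )
    return obj
```

Calling load_foo(pickle.dumps(Foo("data"))) returns the object silently, while a pickle whose stored version is 234 triggers the warning.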

Upvotes: 7
