Mauricio
Mauricio

Reputation: 3129

How to generate hash from Python object for versioning?

I want to generate a hash string for my python Object so I know if it has changed. I tried hash() function but it always return the same value for the given object even if I change it's properties.

Let's say I have the following scenario:

class User(object):
    def __init__(self, name, address):
        self.name = name
        self.address = address

user = User("Anthony", "Street 1")
user.hash()
=> "21e12i24g"

user.address = "Street 2"
user.hash()
=> "85f7ff5a1"

Using hash() function on that example would return the same result for both calls.

Upvotes: 2

Views: 1176

Answers (1)

Green Cloak Guy
Green Cloak Guy

Reputation: 24691

The typical way to give an object a hash function is to add a __hash__() method to the object, which is called by the built-in function hash(). Then, tuple your object's attributes, and return the hash of that collection:

class User(object):
    def __init__(self, name, address):
        self.name = name
        self.address = address
    def __hash__(self):
        return hash((self.name, self.address))

user = User("Anthony", "Street 1")
print(hash(user))
# 4909374541336696414
user.address = "Street 2"
print(hash(user))
# -1615107979785300685

For a more generalized solution that wouldn't require you to explicitly enumerate each property of your object, you could do so dynamically. For example, the following will look at all the attributes of User, list them as key-value tuples inside a larger tuple, and take the hash of that. You could also add an element at the front or back, such as self.__class__, if you wanted that to be a factor in uniqueness.

class User(object):
    def __init__(self, name, address):
        self.name = name
        self.address = address
    def __hash__(self):
        return hash(tuple((k, v) for k,v in vars(self).items()))

user = User("Anthony", "Street 1")
print(hash(user))
# -8646451939475098773
user.address = "Street 2"
print(hash(user))
# 8843995983070120839

It's important to note that, for the purposes of hashing, tuple must be used instead of list, dict, or set, as none of those types are hashable due to being mutable. tuple, being immutable, is safe to hash. Note also that this is a python design decision, and doesn't necessarily mean that you can't design mutable objects (such as User) with hashes.

Upvotes: 3

Related Questions