Reputation: 2703
I'm trying to write some code that prevents the update of modified lists. To do this, I'm calculating the SHA1 hash of the values, but the hexdigest() of this hash produces a different result when I restart my ipython interpreter. Why is this?
In [1]: import hashlib
In [2]: hashid = hashlib.sha1()
In [3]: hashid.update(repr(frozenset(sorted(["a","b","c"]))).encode("utf-8"))
In [4]: hashid.hexdigest()
Out[4]: '53ca01b21fd7cb1996634bb45ad74851f73c45d3'
Reinitializing hashid and repeating the hash calculation in the same ipython3 console gives the same result:
In [5]: hashid = hashlib.sha1()
In [6]: hashid.update(repr(frozenset(sorted(["a","b","c"]))).encode("utf-8"))
In [7]: hashid.hexdigest()
Out[7]: '53ca01b21fd7cb1996634bb45ad74851f73c45d3'
But after stopping the console and restarting it, I get a different result:
In [7]: exit
rvl@laptop ~/ $ ipython3
In [1]: import hashlib
In [2]: hashid = hashlib.sha1()
In [3]: hashid.update(repr(frozenset(sorted(["a","b","c"]))).encode("utf-8"))
In [4]: hashid.hexdigest()
Out[4]: '6e5813fcb173e35e81d6138eab4d21482885e7eb'
Why is this? And how can I produce the same SHA1 hash/hexdigest result when hashing identical sorted lists?
Upvotes: 0
Views: 1068
Reputation: 155323
You can't rely on the ordering of the repr of a set/frozenset object, since the values have no guaranteed order (and in fact, as an anti-denial-of-service feature, the hash codes of strings will differ between different runs of the same version of Python, causing the set ordering to change).
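To see the effect without restarting a console by hand, here is a small sketch that runs the same repr() call in fresh interpreters with different PYTHONHASHSEED values. The helper name repr_under_seed and the seed range are my own choices for illustration; whether any two particular seeds actually give different orderings is not guaranteed, so just look at the printed output.

```python
import os
import subprocess
import sys

# The expression from the question, minus the hashing step.
SNIPPET = 'print(repr(frozenset(["a", "b", "c"])))'


def repr_under_seed(seed):
    """Run SNIPPET in a fresh interpreter with PYTHONHASHSEED set to `seed`."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Each seed stands in for one "restart" of the interpreter.
    for seed in range(5):
        print(seed, repr_under_seed(seed))
```

With a fixed seed the output is repeatable, which is why the hash stays the same inside one session and changes only after a restart (when Python picks a new random seed).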
Swap around your frozenset and sorted calls to get a consistently reproducible representation. Sorted lists have guaranteed ordering, while frozenset will get you uniqueness guarantees:
hashid.update(repr(sorted(frozenset(["a","b","c"]))).encode("utf-8"))
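If you need this in more than one place, the one-liner above can be wrapped in a small helper. This is only a sketch: the name stable_digest is made up here, and it assumes the items are hashable and can be compared with each other (e.g. all strings).

```python
import hashlib


def stable_digest(items):
    """Return a SHA1 hex digest that ignores input order and duplicates."""
    # frozenset removes duplicates; sorted then fixes the order,
    # so the repr is the same in every interpreter run.
    canonical = repr(sorted(frozenset(items)))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(stable_digest(["a", "b", "c"]))
    print(stable_digest(["c", "a", "b", "a"]))  # same digest as the line above
```

Note that because of the frozenset, a list with repeated values hashes the same as one without them. If duplicates matter for detecting a modified list, drop the frozenset and hash repr(sorted(items)) instead.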
Upvotes: 1