Reputation: 1822
I have a simple Python script which pickles an object and prints it.
import pickle
o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}
d = pickle.dumps(o)
print(d)
Following are the outputs I get when I execute the same script multiple times:
b'\x80\x03}q\x00(X\x05\x00\x00\x00firstq\x01K\x01X\x05\x00\x00\x00thirdq\x02K\x03X\x06\x00\x00\x00secondq\x03K\x02X\x02\x00\x00\x00lsq\x04]q\x05(K\x01K\x02K\x03eu.'
b'\x80\x03}q\x00(X\x05\x00\x00\x00thirdq\x01K\x03X\x02\x00\x00\x00lsq\x02]q\x03(K\x01K\x02K\x03eX\x05\x00\x00\x00firstq\x04K\x01X\x06\x00\x00\x00secondq\x05K\x02u.'
b'\x80\x03}q\x00(X\x05\x00\x00\x00firstq\x01K\x01X\x06\x00\x00\x00secondq\x02K\x02X\x02\x00\x00\x00lsq\x03]q\x04(K\x01K\x02K\x03eX\x05\x00\x00\x00thirdq\x05K\x03u.'
b'\x80\x03}q\x00(X\x05\x00\x00\x00thirdq\x01K\x03X\x05\x00\x00\x00firstq\x02K\x01X\x02\x00\x00\x00lsq\x03]q\x04(K\x01K\x02K\x03eX\x06\x00\x00\x00secondq\x05K\x02u.'
Is it just a difference in ordering of the properties of the object or is there more to it?
Upvotes: 0
Views: 524
Reputation: 1122222
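In Python 3.3 and later, string hashes are randomised: each time you start the interpreter, a different, random hash seed is used. Before Python 3.7 (CPython 3.6), dictionary order depended on those hashes, so it changes from run to run; later versions preserve insertion order, although sets are still affected. If you were to print the dictionary, you'd see the different ordering too: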
$ bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'first': 1, 'ls': [1, 2, 3], 'second': 2, 'third': 3}
$ bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'ls': [1, 2, 3], 'third': 3, 'first': 1, 'second': 2}
$ bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'second': 2, 'ls': [1, 2, 3], 'third': 3, 'first': 1}
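To check that the pickles differ only in key order and nothing more, here is a minimal sketch. It assumes Python 3.7 or later, where dictionaries keep insertion order, so the two orderings are built by hand rather than produced by hash randomisation:

```python
import pickle

# Same keys and values, inserted in two different orders.
# Assumes Python 3.7+, where dicts preserve insertion order.
a = {'first': 1, 'second': 2, 'third': 3, 'ls': [1, 2, 3]}
b = {'third': 3, 'ls': [1, 2, 3], 'first': 1, 'second': 2}

pickled_a = pickle.dumps(a)
pickled_b = pickle.dumps(b)

# The byte strings differ because items are written in iteration order...
same_bytes = pickled_a == pickled_b
# ...but both load back to dictionaries that compare equal.
same_object = pickle.loads(pickled_a) == pickle.loads(pickled_b)

print(same_bytes, same_object)
```

This prints False True: the serialised bytes are not identical, but the unpickled objects are equal, so comparing raw pickle output is not a reliable equality check for dictionaries.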
Python uses a random seed to prevent certain types of Denial of Service attacks against programs that parse incoming user data into dictionaries, such as web servers. Without it, an attacker could predict which strings cause hash collisions in a dictionary and feed Python values that do nothing but create collisions, slowing the program to a crawl.
You can set the seed to a fixed value with the PYTHONHASHSEED environment variable, or you can disable hash randomisation altogether:
The integer must be a decimal number in the range [0,4294967295]. Specifying the value 0 will disable hash randomization.
$ PYTHONHASHSEED=0 bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'third': 3, 'ls': [1, 2, 3], 'first': 1, 'second': 2}
$ PYTHONHASHSEED=0 bin/python -c "o = {'first':1,'second':2,'third':3,'ls':[1,2,3]}; print(o)"
{'third': 3, 'ls': [1, 2, 3], 'first': 1, 'second': 2}
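The same effect can be checked from inside Python. The following is a rough sketch, not a definitive recipe: `run_with_seed` is a hypothetical helper that starts a child interpreter with PYTHONHASHSEED set and prints the hash of a string, since on newer Python versions printing a dictionary no longer reveals the seed.

```python
import os
import subprocess
import sys


def run_with_seed(seed):
    """Run a child interpreter with PYTHONHASHSEED set and return its output.

    `run_with_seed` is a hypothetical helper used for illustration only.
    """
    # Copy the current environment and override only the hash seed.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('first'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# Two separate interpreter runs with randomisation disabled (seed 0).
first_run = run_with_seed(0)
second_run = run_with_seed(0)
print(first_run == second_run)
```

This prints True: with the seed pinned, separate interpreter processes compute the same string hash, which is what makes the dictionary order above repeatable.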
Also see: Why is the order in dictionaries and sets arbitrary?
Upvotes: 2