yoav

Reputation: 322

Why does updating a dictionary not keep the insertion order?

I wrote this code:

words_dict = {}
my_list = ["a", "b", "c", "d", "e"]
for st in my_list:
    words_dict.update({st: 0})
print(words_dict)

The output I expected is:

{'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 0}

But I get

{'a': 0, 'c': 0, 'b': 0, 'e': 0, 'd': 0}

Why is this happening and how can I get {'a':0, 'b':0, 'c':0, 'd':0, 'e':0} instead?

Upvotes: 1

Views: 614

Answers (1)

Vlad Bezden

Reputation: 89547

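Before Python 3.7, dict did not guarantee any ordering, so if you wanted to preserve the order of items in a dictionary you had to use collections.OrderedDict. (CPython 3.6 already kept insertion order, but only as an implementation detail.)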

This happened because the dictionary type previously implemented its hash table algorithm with a combination of the hash built-in function and a random seed that was assigned when the Python interpreter started. Together, these behaviors caused dictionary orderings to not match insertion order and to randomly shuffle between program executions.
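For interpreters older than 3.7, the OrderedDict approach could look roughly like the sketch below. It reuses my_list and words_dict from the question; the name same_result is only there for illustration.

```python
from collections import OrderedDict

my_list = ["a", "b", "c", "d", "e"]

# OrderedDict remembers the order in which keys were first inserted.
words_dict = OrderedDict()
for st in my_list:
    words_dict[st] = 0

# Equivalent one-liner: every key gets the same starting value.
same_result = OrderedDict.fromkeys(my_list, 0)

print(list(words_dict.items()))
```

Iterating over words_dict (or calling keys(), values(), items()) then follows the order of my_list.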

In Python 3.7 and above, the insertion order of items in a dictionary is preserved, so you no longer need OrderedDict just to keep that order.

The insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.

The way that dictionaries preserve insertion ordering is now part of the Python language specification. You can rely on this behavior and even make it part of the APIs you design for your classes and functions.
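As a quick check, here is the loop from the question written for Python 3.7 or newer with a plain dict. This is just a small sketch; same_dict is a name I made up to show the dict.fromkeys shortcut.

```python
my_list = ["a", "b", "c", "d", "e"]

# Same loop as in the question, using plain item assignment.
words_dict = {}
for st in my_list:
    words_dict[st] = 0

# dict.fromkeys builds the same mapping in a single call.
same_dict = dict.fromkeys(my_list, 0)

print(words_dict)  # on Python 3.7+: {'a': 0, 'b': 0, 'c': 0, 'd': 0, 'e': 0}
```

Note that words_dict[st] = 0 does the same thing as words_dict.update({st: 0}) without building a temporary dictionary on every iteration.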

I also measured how long it takes to create a regular dict and an OrderedDict. In the timings below, the regular dict is about 3 times faster when built directly from a list of pairs, and about 1.5 times faster when filled in a loop.

from collections import OrderedDict

data = [(i, chr(i)) for i in range(65, 91)]


%%timeit

d = dict(data)

2.27 µs ± 235 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


%%timeit

d = OrderedDict(data)

6.59 µs ± 1.32 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)


%%timeit

d = {}

for k, v in data:
    d[k] = v

4.84 µs ± 1.31 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)


%%timeit

d = OrderedDict()

for k, v in data:
    d[k] = v

7.48 µs ± 1.6 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
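The %%timeit cells above only run in IPython/Jupyter. If you want to repeat the comparison as a plain script, something like the following sketch with the standard timeit module should work. The helper names (build_dict_loop, build_ordered_loop, candidates, results) and the repeat counts are my own choices, and the absolute numbers will differ per machine and Python version.

```python
import timeit
from collections import OrderedDict

data = [(i, chr(i)) for i in range(65, 91)]


def build_dict_loop():
    # Fill a plain dict one key at a time.
    d = {}
    for k, v in data:
        d[k] = v
    return d


def build_ordered_loop():
    # Fill an OrderedDict one key at a time.
    d = OrderedDict()
    for k, v in data:
        d[k] = v
    return d


candidates = {
    "dict(data)": lambda: dict(data),
    "OrderedDict(data)": lambda: OrderedDict(data),
    "dict loop": build_dict_loop,
    "OrderedDict loop": build_ordered_loop,
}

NUMBER = 20000  # calls per timing run
results = {}
for name, func in candidates.items():
    # Take the best of 3 runs and convert to seconds per call.
    best = min(timeit.repeat(func, number=NUMBER, repeat=3)) / NUMBER
    results[name] = best
    print("{:20s} {:.2f} us per call".format(name, best * 1e6))
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes on the machine.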

Upvotes: 3
