user2725109
user2725109

Reputation: 2386

Modifying a dictionary while iterating over it. Bug in Python dict?

The output of

d = {1: 1}
for k in d.keys():
    d['{}'.format(k)] = d.pop(k)
print(d)

is {'1': 1}. The output of

d = {1: 1}
for k in d.keys():
    d['i{}'.format(k)] = d.pop(k)
print(d)

is {'iiiii1': 1}. Is this a bug? I am running Python 3.6.1 :: Anaconda 4.4.0 (x86_64).

Upvotes: 7

Views: 4511

Answers (1)

Martijn Pieters
Martijn Pieters

Reputation: 1121484

No, this is not a bug. This is in fact explicitly documented:

Keys and values are iterated over in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions. If keys, values and items views are iterated over with no intervening modifications to the dictionary, the order of items will directly correspond.

[...]

Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.

Bold emphasis mine.

You are iterating over the keys, while at the same time both adding and deleting entries in the dictionary. That worked for a few iterations, and then you hit a fail to iterate over all entries case and iteration stopped.

What happens is that you trigger a re-size at 6 additions, and that causes iteration to fail at that point; the 'next' key is now slotted in an 'earlier' slot. This happens for both tests, you just don't realise it iterates 5 times in both cases:

>>> d = {1: 1}
>>> for i, k in enumerate(d):
...     print(i)
...     d['{}'.format(k)] = d.pop(k)
...
0
1
2
3
4
>>> d = {1: 1}
>>> for i, k in enumerate(d):
...     print(i)
...     d['i{}'.format(k)] = d.pop(k)
...
0
1
2
3
4

It is run 5 times because the current dict implementation starts with a hash table of size 8, and a resize is triggered when the table is 2/3s full (your initial dict has 1 entry, 5 insertions make it > (8 * 2/3 == 5.333333). The table is getting filled with DKIX_DUMMY entities, entered when you delete a key (needed to handle hash collisions correctly).

Note that this is all highly implementation dependent. In Python 3.5 and before, both snippets iterate just once (even if you use for k in d: to avoid creating a list object for the keys); iteration continues in 3.6 because the implementation changed and iteration now follows insertion order. Future Python versions are free to alter the implementation again.

Upvotes: 22

Related Questions