rafaoc
rafaoc

Reputation: 596

How does the memory allocation work in python dictionaries?

I want to understand how works the memory allocation in python when adding new data to a dictionary. In the code below, I was waiting that every new added data was stacked at the end, however it does not happen.

repetitions = {}
for item in new_deltas:
    list_aux = []
    if float(item[1]) <= 30:
        if float(item[0]) in repetitions:
            aux = repetitions[float(item[0])]
            aux.append(item[1])
            repetitions[float(item[0])] = aux
        else:
            list_aux.append(item[1])
            repetitions[float(item[0])] = list_aux
    print(repetitions)

The results I got are as below. Thus, I would like to understand why the new appended data is not added at the end of the stack, it is added in the middle of it.

My input data is:

new_deltas = [[1.452, 3.292182683944702], [1.449, 4.7438647747039795], [1.494, 6.192960977554321], [1.429, 7.686920166015625]] 

The print line outputs:

{1.452: [3.292182683944702]}
{1.452: [3.292182683944702], 1.449: [4.7438647747039795]}
{1.452: [3.292182683944702], 1.494: [6.192960977554321], 1.449: [4.7438647747039795]}
{1.429: [7.686920166015625], 1.452: [3.292182683944702], 1.494: [6.192960977554321], 1.449: [4.7438647747039795]}

Upvotes: 1

Views: 630

Answers (2)

Raymond Hettinger
Raymond Hettinger

Reputation: 226256

Short answer

Dicts are implemented as hash tables rather than stacks.

Without additional measures that tends to scramble the order of keys

Hash Tables

Prior to Python 3.6, the ordering in a dictionary was randomized by a hash function. Roughly, here's how it worked:

d = {}        # Make a new dictionary
              # Internally 8 buckets are formed:
              #    [ [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] ]
d['a'] = 10   # hash('a') % s gives perhaps bucket 5:
              #    [ [ ] [ ] [ ] [ ] [ ] [('a', 10)] [ ] [ ] ]
d['b'] = 20   # hash('b') % s gives perhaps bucket 2:
              #    [ [ ] [ ] [('b', 20)] [ ] [ ] [('a', 10)] [ ] [ ] ]

So, you can see the ordering of this dict would put 'b' before 'a' because the hash function put 'b' in an earlier bucket.

Newer hash tables that remember insertion order

Starting in Python 3.6, there was a stack added as well. See this proof-of-concept for a better idea of how that works.

Accordingly, dicts started to remember insertion order and this behavior became guaranteed in Python 3.7 and later.

Use OrderedDict on older Python implementations

Prior to 3.7, you can use collections.OrderedDict() to get the same effect.

Deeper dive

For those interested in knowing more about how it works, I have a 37 minute video that shows from first principles all of the techniques used to make modern Python dictionaries.

Upvotes: 3

rkersh
rkersh

Reputation: 4465

Prior to Python 3.6, dictionaries were not ordered (see this stackoverflow thread for more on that). If you are using Python 3.6 or lower (in CPython 3.6 the fact that order is maintained is an implementation detail, but with Python 3.7 it became a language feature), you can use the OrderedDict to get the behavior you want.

For example, you could change the beginning of your code snippet to the following:

from collections import OrderedDict
repetitions = OrderedDict()
...

Upvotes: 2

Related Questions