chrtan

Reputation: 1724

Is there a performance difference in using a tuple over a frozenset as a key for a dictionary?

I have a script that makes many calls to a dictionary using a key consisting of two variables. I know that my program will encounter the two variables again in the reverse order, which makes storing the key as a frozenset feasible. (I'm creating a matrix with the same labels for rows and columns.)

Therefore, I was wondering if there was a performance difference in using a tuple over a frozenset for a dictionary key.
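
For example (a rough sketch of the situation, with made-up labels):

# The same pair of labels can show up in either order.
matrix = {}
matrix[frozenset(('a', 'b'))] = 3.5

# A frozenset key is unordered, so both orders reach the same entry:
print(matrix[frozenset(('b', 'a'))])   # 3.5

# With a tuple key, ('a', 'b') and ('b', 'a') would be two different keys,
# so I'd have to canonicalise first, e.g. tuple(sorted(pair)).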

Upvotes: 8

Views: 4240

Answers (4)

sjelin

Reputation: 303

The top answer (Gareth Latty's) appears to be out of date. On Python 3.6, hashing a frozenset appears to be much faster, but it depends quite a bit on what you're hashing:

sjelin@work-desktop:~$ ipython
Python 3.6.9 (default, Nov  7 2019, 10:44:02)

In [1]: import time

In [2]: def perf(get_data):
   ...:     tuples = []
   ...:     sets = []
   ...:     for _ in range(10000):
   ...:         t = tuple(get_data(10000))
   ...:         tuples.append(t)
   ...:         sets.append(frozenset(t))
   ...: 
   ...:     start = time.time()
   ...:     for s in sets:
   ...:         hash(s)
   ...:     mid = time.time()
   ...:     for t in tuples:
   ...:         hash(t)
   ...:     end = time.time()
   ...:     return {'sets': mid-start, 'tuples': end-mid}
   ...: 

In [3]: perf(lambda n: range(n))
Out[3]: {'sets': 0.32627034187316895, 'tuples': 0.22960591316223145}

In [4]: from random import random

In [5]: perf(lambda n: (random() for _ in range(n)))
Out[5]: {'sets': 0.3242628574371338, 'tuples': 1.117497205734253}

In [6]: perf(lambda n: (0 for _ in range(n)))
Out[6]: {'sets': 0.0005457401275634766, 'tuples': 0.16936826705932617}

In [7]: perf(lambda n: (str(i) for i in range(n)))
Out[7]: {'sets': 0.33167099952697754, 'tuples': 0.3538074493408203}

In [8]: perf(lambda n: (object() for _ in range(n)))
Out[8]: {'sets': 0.3275420665740967, 'tuples': 0.18484067916870117}

In [9]: class C:
   ...:     def __init__(self):
   ...:         self._hash = int(random()*100)
   ...:         
   ...:     def __hash__(self):
   ...:         return self._hash
   ...:     

In [10]: perf(lambda n: (C() for i in range(n)))
Out[10]: {'sets': 0.32653021812438965, 'tuples': 6.292834997177124}

Some of these differences are enough to matter in a perf context, but only if hashing is actually your bottleneck (which almost never happens).

I'm not sure what to make of the fact that the frozensets almost always ran in ~0.33 seconds, while the tuples took anywhere between 0.2 and 6.3 seconds. To be clear, rerunning with the same lambda never changed the results by more than 1%, so it's not like there's a bug.

In Python 2 the results were different, and the two were generally closer to each other, which is probably why Gareth didn't see the same differences.

Upvotes: 1

senderle

Reputation: 151087

Without having done any tests, I have a few guesses. For frozensets, cpython stores the hash after it has been calculated; furthermore, iterating over a set of any kind incurs extra overhead because the data is stored sparsely. In a 2-item set, that imposes a significant performance penalty on the first hash, but would probably make the second hash very fast -- at least when the object itself is the same (i.e. is not a new but equivalent frozenset).

For tuples, cpython does not store the hash, but rather calculates it every time. So it might be that repeated hashing is slightly cheaper with frozensets. But for such a short tuple, there's probably almost no difference; it's even possible that very short tuples will be faster.
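
A rough way to see the caching difference I'm describing, using deliberately large containers so the effect is visible (the sizes are arbitrary):

import timeit

def first_vs_second_hash(obj):
    # Time the first hash() call, then a second call on the same object.
    t0 = timeit.default_timer(); hash(obj)
    t1 = timeit.default_timer(); hash(obj)
    t2 = timeit.default_timer()
    return t1 - t0, t2 - t1

# frozenset: the second call should be far cheaper, because the hash is cached.
print(first_vs_second_hash(frozenset(range(100000))))
# tuple: both calls should cost about the same, because the hash is recomputed.
print(first_vs_second_hash(tuple(range(100000))))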

Lattyware's current timings line up reasonably well with my line of reasoning here; see below.

To test my intuition about the asymmetry of hashing new vs. old frozensets, I did the following. I believe the difference in timings is exclusively due to the extra hash time, which is pretty insignificant, by the way:

>>> fs = frozenset((1, 2))
>>> old_fs = lambda: [frozenset((1, 2)), fs][1]
>>> new_fs = lambda: [frozenset((1, 2)), fs][0]
>>> id(fs) == id(old_fs())
True
>>> id(fs) == id(new_fs())
False
>>> %timeit hash(old_fs())
1000000 loops, best of 3: 642 ns per loop
>>> %timeit hash(new_fs())
1000000 loops, best of 3: 660 ns per loop

Note that my previous timings were wrong; using 'and' created a timing asymmetry that the above method avoids. This new method produces the expected result for tuples here -- a negligible timing difference:

>>> tp = (1, 2)
>>> old_tp = lambda: [tuple((1, 2)), tp][1]
>>> new_tp = lambda: [tuple((1, 2)), tp][0]
>>> id(tp) == id(old_tp())
True
>>> id(tp) == id(new_tp())
False
>>> %timeit hash(old_tp())
1000000 loops, best of 3: 533 ns per loop
>>> %timeit hash(new_tp())
1000000 loops, best of 3: 532 ns per loop

And, the coup de grace, comparing hash time for a pre-constructed frozenset to hash time for a pre-constructed tuple:

>>> %timeit hash(fs)
10000000 loops, best of 3: 82.2 ns per loop
>>> %timeit hash(tp)
10000000 loops, best of 3: 93.6 ns per loop

Lattyware's results look more like this because they are an average of results for new and old frozensets. (They hash each tuple or frozenset twice, once in creating the dictionary, once in accessing it.)
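
One way to see that double hashing directly is with a key type that counts its own hash calls (a throwaway class, just for illustration):

class CountingKey(object):
    def __init__(self, value):
        self.value = value
        self.hash_calls = 0
    def __hash__(self):
        self.hash_calls += 1
        return hash(self.value)
    def __eq__(self, other):
        return self.value == other.value

k = CountingKey((1, 2))
d = {k: 'x'}            # first hash: building the dictionary
_ = d[k]                # second hash: looking the key up
print(k.hash_calls)     # 2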

The upshot of all this is that it probably doesn't matter, except to those of us who enjoy digging around in Python's internals and testing things into oblivion.

Upvotes: 9

Marcin

Reputation: 49856

While you can use timeit to find out (and I encourage you to do so, if for no other reason than to learn how it works), in the end it almost certainly doesn't matter.
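
If you do want to measure it yourself, something along these lines would do (a rough sketch; the sizes and repeat counts are arbitrary):

import timeit

tuple_setup = """
pairs = list(zip(range(1000), range(1000, 2000)))
d = dict((pair, i) for i, pair in enumerate(pairs))
"""
fs_setup = """
pairs = [frozenset(p) for p in zip(range(1000), range(1000, 2000))]
d = dict((p, i) for i, p in enumerate(pairs))
"""
lookups = """
for key in pairs:
    _ = d[key]
"""

# Note: re-using the same key objects across runs favours frozensets,
# since their hashes are cached after the first lookup.
print("tuple keys:    ", timeit.timeit(lookups, setup=tuple_setup, number=1000))
print("frozenset keys:", timeit.timeit(lookups, setup=fs_setup, number=1000))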

frozensets are designed specifically to be hashable, so I would be shocked if their hash method is linear time. This kind of micro-optimisation can only matter if you need to get through a fixed (large) number of look-ups in a very short amount of time in a realtime application.

Update: Look at the various updates and comments to Lattyware's answer - it took a lot of collective effort (well, relatively) to strip out the confounding factors and show that the performance of the two approaches is almost the same. The performance hits were not where they were assumed to be, and it will be the same in your own code.

Write your code to work, then profile to find the hotspots, then apply algorithmic optimisations, then apply micro-optimisations.
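
For the profiling step, the standard library's cProfile is usually enough (a minimal sketch, with a made-up workload):

import cProfile
import pstats

def workload(n=500):
    # Made-up stand-in for the real program: matrix-style lookups keyed by unordered pairs.
    d = {frozenset((i, j)): i * j for i in range(n) for j in range(i)}
    return sum(d[frozenset((j, i))] for i in range(n) for j in range(i))

cProfile.run("workload()", "profile.out")
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)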

Upvotes: 2

Gareth Latty

Reputation: 89077

In a quick test, apparently it makes a negligible difference.

python -m timeit -s "keys = list(zip(range(10000), range(10, 10000)))" -s "values = range(10000)" -s "a=dict(zip(keys, values))" "for i in keys:" "  _ = a[i]"
1000 loops, best of 3: 855 usec per loop

python -m timeit -s "keys = [frozenset(i) for i in zip(range(10000), range(10, 10000))]" -s "values = range(10000)" -s "a=dict(zip(keys, values))" "for i in keys:" "  _ = a[i]"
1000 loops, best of 3: 848 usec per loop

I really would just go with what is best elsewhere in your code.

Upvotes: 11
