Efficiently accessing indices: is storing indices in a dict the fastest way?

Question

I have a list of tuples that I need to access the index of repeatedly. Is the fastest way to do this to store the tuples and their indices in a dict? Or is there a better way?

Given a tuple that exists in an ordered list of tuples, I need to access the index of it efficiently. I can think of two ways to do it; call .index on the list, or store the tuples as keys in a dictionary with the index as their value and look up the index by doing a dictionary lookup.

order = list(itertools.product(range(4),repeat=6))
%timeit order.index(random.choice(order))
# 10000 loops, best of 3: 35 µs per loop

order_dic = {i:order.index(i) for i in order}
%timeit order_dic[random.choice(order)]
# 1000000 loops, best of 3: 443 ns per loop

Tuomas Laakkonen · Accepted Answer

Almost always, the second option will be faster. This is because in the first example, you search the list in order to find the index, an O(n) operation on lists. Whereas in the second example, you lookup a value in the dictionary, which is O(1) on average.

However, space is something to consider here as the second option has the overhead of an additional dict. In Brandon Rhodes's talk at Pycon 2014, he says that dicts tend to use 3 to 4 times more space than they need, so that adding new elements is still fast, which could be a lot of overhead if you have a large list of tuples.

If list of tuples you use as an example is actually the list you are searching through, then there is a fast and memory-efficient way to calculate the index of a given tuple, relying on the way itertools.product works:

a[5] + a[4] >> 2 + a[3] >> 4 + a[2] >> 6 + a[1] >> 8 + a[0] >> 10

This will give the index of tuple a. This is nearly as fast as the second option but does not have the additional space requirement of an extra dict.

(The idea behind this method is that the tuple is essentially a six-digit base-four number, and since itertools.product outputs them in order, this will give the index.)

Efficiently accessing indices: is storing indices in a dict the fastest way?

Answers (2)

Related Questions