ealeon

Reputation: 12452

Python dict: the size affects timing?

Let's say you have one key in dictionary A vs. 1 billion keys in dictionary B.

Algorithmically, a lookup op is O(1).

However, would the actual time (program execution time) to look up a key differ based on the size of the dict?

import time

# time a lookup in the one-key dict
onekey_stime = time.time()
print one_key_dict.get('firstkey')
onekey_dur = time.time() - onekey_stime

# time a lookup in the billion-key dict
manykeys_stime = time.time()
print manykeys_dict.get('randomkey')
manykeys_dur = time.time() - manykeys_stime

Would I see any time difference between onekey_dur and manykeys_dur?
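As an aside, a single `time.time()` delta around one lookup mostly measures `print` and clock overhead, not the lookup. A steadier comparison (a sketch only; the dict names and the 1-million-key size here are stand-ins for the question's setup) repeats each lookup many times with `timeit`:

```python
import timeit

# Stand-in dicts: one key vs. one million keys (the question's
# 1 billion would follow the same pattern, just with more memory).
one_key_dict = {'firstkey': 1}
manykeys_dict = {str(i): i for i in range(1_000_000)}

# Average each lookup over a million calls so per-call noise cancels out.
onekey_dur = timeit.timeit(lambda: one_key_dict.get('firstkey'),
                           number=1_000_000)
manykeys_dur = timeit.timeit(lambda: manykeys_dict.get('500000'),
                             number=1_000_000)

print(onekey_dur, manykeys_dur)
```

Both totals typically land in the same ballpark, which is the O(1) behavior the question asks about.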

Upvotes: 2

Views: 190

Answers (1)

Randy

Reputation: 14847

Pretty much identical in a test with a small and large dict:

In [31]: random_key = lambda: ''.join(np.random.choice(list(string.ascii_letters), 20))

In [32]: few_keys = {random_key(): np.random.random() for _ in xrange(100)}

In [33]: many_keys = {random_key(): np.random.random() for _ in xrange(1000000)}

In [34]: few_lookups = np.random.choice(few_keys.keys(), 50)

In [35]: many_lookups = np.random.choice(many_keys.keys(), 50)

In [36]: %timeit [few_keys[k] for k in few_lookups]
100000 loops, best of 3: 6.25 µs per loop

In [37]: %timeit [many_keys[k] for k in many_lookups]
100000 loops, best of 3: 7.01 µs per loop

EDIT: For you, @ShadowRanger -- missed lookups are pretty close too:

In [38]: %timeit [few_keys.get(k) for k in many_lookups]
100000 loops, best of 3: 7.99 µs per loop

In [39]: %timeit [many_keys.get(k) for k in few_lookups]
100000 loops, best of 3: 8.78 µs per loop
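The session above is Python 2 (`xrange`, list-returning `keys()`). A rough Python 3 equivalent of the same experiment (an assumed reconstruction, with the large dict shrunk to 100k keys to keep it quick) would look like:

```python
import string
import timeit

import numpy as np

rng = np.random.default_rng()

def random_key():
    # 20 random ASCII letters, mirroring the lambda in the session above
    return ''.join(rng.choice(list(string.ascii_letters), 20))

few_keys = {random_key(): rng.random() for _ in range(100)}
many_keys = {random_key(): rng.random() for _ in range(100_000)}

# In Python 3, dict.keys() is a view, so materialize it before sampling
few_lookups = rng.choice(list(few_keys), 50)
many_lookups = rng.choice(list(many_keys), 50)

t_few = timeit.timeit(lambda: [few_keys[k] for k in few_lookups],
                      number=1000)
t_many = timeit.timeit(lambda: [many_keys[k] for k in many_lookups],
                       number=1000)
print(t_few, t_many)
```

The two timings should again come out close, since each lookup hashes the key once and probes a handful of slots regardless of table size.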

Upvotes: 3
