Javier Novoa C.
Javier Novoa C.

Reputation: 11837

Which is faster a dictionary retrieve or a list indexing?

I'm trying to decide which data structure best fits my needs.

Technical details apart, I may translate the needs of my program to use dictionaries or to use lists, and since performance will be an issue I was wondering what may be a faster solution. I ended up concluding that retrieving/indexing will be the critical operation.

So what's more efficient in terms of memory usage and speed?

Upvotes: 10

Views: 8599

Answers (3)

PM 2Ring
PM 2Ring

Reputation: 55469

If you don't need to search, use a list. It's faster, and uses less RAM than a dict. For small collections (<100 items) the speed differences are minimal, but for large collections the dict will be around 20% slower. And it will certainly use more RAM.

Here's some timeit code that compares the access speed of list vs dict. It also shows the RAM consumed by the collection objects themselves (but not that of the data objects they hold).

#!/usr/bin/env python3

''' Compare list vs dict access speed 

    See http://stackoverflow.com/q/39192442/4014959

    Written by PM 2Ring 2016.08.29
'''

from sys import getsizeof
from timeit import Timer

commands = {'dict' : 'for i in r:d[i]', 'list' : 'for i in r:a[i]'}

def time_test(loops, reps):
    timings = []
    setup = 'from __main__ import r, a, d'
    for name, cmd in commands.items():
        result = Timer(cmd, setup).repeat(reps, loops)
        result.sort()
        timings.append((result, name))

    timings.sort()
    for result, name in timings:
        print(name, result)

    #Find the ratio of slowest / fastest
    tlo, thi = [timings[i][0][0] for i in (0, -1)]
    print('ratio: {0:f}\n'.format(thi / tlo))

num = 2000
r = range(num)
a = list(r)
d = {i:i for i in r}

fmt = 'Sizes: num={}, list={}, dict={}'
print(fmt.format(num, getsizeof(a), getsizeof(d)))

loops, reps = 2000, 3
time_test(loops, reps)

output

Sizes: num=2000, list=9056, dict=49200
list [1.2624831110006198, 1.3356713940011105, 1.406396518003021]
dict [1.506095960001403, 1.525646976999269, 1.5623748449979757]
ratio: 1.192963

The speed difference is actually higher than it appears in those results, since the time taken to retrieve the integers from the r range object is roughly the same as the time taken to do the list & dict accesses. You can measure that by adding an entry like this to the commands dictionary:

'none': 'for i in r:i'

Upvotes: 10

turkus
turkus

Reputation: 4893

Well, retrieving from a dict has the same complexity as retrieving from a list:

_dict[1]
_list[1]

but..., with dicts, u can use:

_dict.get(1, 'something else')

It just returns 'something else' if there is no key 1. Using lists, you cannot do that. You just want to get item indexed by number 1. If you will ask for item outside your list, so index will be higher than your list length, then your program will raise IndexError which must be handled. Next problem will be that you will have to know size of that list, so you need to check it first.

Upvotes: 0

BPL
BPL

Reputation: 9863

https://wiki.python.org/moin/TimeComplexity provides all the info you're looking for

Upvotes: 1

Related Questions