Reputation: 11837
I'm trying to decide which data structure best fits my needs.
Technical details apart, I may translate the needs of my program to use dictionaries or to use lists, and since performance will be an issue I was wondering what may be a faster solution. I ended up concluding that retrieving/indexing will be the critical operation.
So what's more efficient in terms of memory usage and speed?
Upvotes: 10
Views: 8599
Reputation: 55469
If you don't need to search, use a list. It's faster, and uses less RAM than a dict. For small collections (<100 items) the speed differences are minimal, but for large collections the dict will be around 20% slower. And it will certainly use more RAM.
Here's some timeit
code that compares the access speed of list vs dict. It also shows the RAM consumed by the collection objects themselves (but not that of the data objects they hold).
#!/usr/bin/env python3
''' Compare list vs dict access speed
See http://stackoverflow.com/q/39192442/4014959
Written by PM 2Ring 2016.08.29
'''
from sys import getsizeof
from timeit import Timer
commands = {'dict' : 'for i in r:d[i]', 'list' : 'for i in r:a[i]'}
def time_test(loops, reps):
timings = []
setup = 'from __main__ import r, a, d'
for name, cmd in commands.items():
result = Timer(cmd, setup).repeat(reps, loops)
result.sort()
timings.append((result, name))
timings.sort()
for result, name in timings:
print(name, result)
#Find the ratio of slowest / fastest
tlo, thi = [timings[i][0][0] for i in (0, -1)]
print('ratio: {0:f}\n'.format(thi / tlo))
num = 2000
r = range(num)
a = list(r)
d = {i:i for i in r}
fmt = 'Sizes: num={}, list={}, dict={}'
print(fmt.format(num, getsizeof(a), getsizeof(d)))
loops, reps = 2000, 3
time_test(loops, reps)
output
Sizes: num=2000, list=9056, dict=49200
list [1.2624831110006198, 1.3356713940011105, 1.406396518003021]
dict [1.506095960001403, 1.525646976999269, 1.5623748449979757]
ratio: 1.192963
The speed difference is actually higher than it appears in those results, since the time taken to retrieve the integers from the r
range object is roughly the same as the time taken to do the list & dict accesses. You can measure that by adding an entry like this to the commands
dictionary:
'none': 'for i in r:i'
Upvotes: 10
Reputation: 4893
Well, retrieving from a dict has the same complexity as retrieving from a list:
_dict[1]
_list[1]
but..., with dicts, u can use:
_dict.get(1, 'something else')
It just returns 'something else' if there is no key 1
. Using lists, you cannot do that. You just want to get item indexed by number 1
. If you will ask for item outside your list, so index will be higher than your list length, then your program will raise IndexError
which must be handled. Next problem will be that you will have to know size of that list, so you need to check it first.
Upvotes: 0
Reputation: 9863
https://wiki.python.org/moin/TimeComplexity provides all the info you're looking for
Upvotes: 1