Reputation: 435
I was coding a Euler problem, and I ran into question that sparked my curiosity. I have two snippets of code. One is with lists the other uses dictionaries.
using lists:
n=100000
num=[]
suma=0
for i in range(n,1,-1):
tmp=tuple(set([n for n in factors(i)]))
if len(tmp) != 2: continue
if tmp not in num:
num.append(tmp)
suma+=i
using dictionaries:
n=100000
num={}
suma=0
for i in range(n,1,-1):
tmp=tuple(set([n for n in factors(i)]))
if len(tmp) != 2: continue
if tmp not in num:
num[tmp]=i
suma+=i
I am only concerned about performance. Why does the second example using dictionaries run incredibly fast, faster than the first example with lists. the example with dictionaries runs almost thirty-fold faster!
I tested these 2 code using n=1000000, and the first code run in 1032 seconds and the second one run in just 3.3 second,,, amazin'!
Upvotes: 40
Views: 60102
Reputation: 1
python lists are indexed lists. Lookups based on index are much faster than in regular linked lists. Even the link on time complexity above says that list lookups are O(1).
Upvotes: 0
Reputation: 42748
If it's about speed, you should not create any lists:
n = 100000
factors = ((frozenset(factors(i)), i) for i in range(2, n+1))
num = {k:v for k,v in factors if len(k)==2}
suma = sum(num.values())
Upvotes: 6
Reputation: 8600
In Python, the average time complexity of a dictionary key lookup is O(1), since they are implemented as hash tables. The time complexity of lookup in a list is O(n) on average. In your code, this makes a difference in the line if tmp not in num:
, since in the list case, Python needs to search through the whole list to detect membership, whereas in the dict case it does not except for the absolute worst case.
For more details, check out TimeComplexity.
Upvotes: 54
Reputation: 446
In a list, the code if tmp not in num:
is O(n), while it is O(lgn) in dict.
Edit: The dict is based on hashing, so it is much quicker than liner list search. Thanks @user2357112 for point this.
Upvotes: 3
Reputation: 1577
I am almost positive that the "magic sauce" using a dictionary lies in the fact that the dictionary is comprised of key->value pairs.
in a list, youre dealing with arrays, which means the for loop has to start at index 0 inside of your list in order to loop through every record.
the dictionary just has to find the key->value pair in question on the first 'go-round' and return it, hence the speed...
basically, testing for membership in a set of key->value pairs is a lot quicker than searching an entire list for a value. the larger your list gets the slower it will be... but this isnt always the case, there are scenarios where a list will be faster... but i believe this may be the answer youre looking for
Upvotes: 1