Reputation: 4870
I was solving some competitive coding coding challenges. My solution was exceeding time limit when I was searching in list, but was accepted when I searched in string.
Below is a sample code:
CODE
import timeit
list = timeit.Timer("'A' in list('AEIOU')")
set = timeit.Timer("'A' in set('AEIOU')")
string = timeit.Timer("'A' in 'AEIOU'")
print 'List search :',list.timeit(999999)
print 'Set search',set.timeit(999999)
print 'String search',string.timeit(999999)
OUTPUT
List search : 1.0873670578
Set search 1.0083398819
String search 0.0997061729431
*This output was obtained in ideone.com ... http://ideone.com/5FxkX2
**Output may vary from system to system but the trend is same.
I found that string search is incredibly faster. I googled but didn't found a satisfactory reasoning to this. Please help me understand why is string search fastest?
Upvotes: 3
Views: 293
Reputation: 1122282
This has very little to do with actual search speed and everything with having to look up a global name (set
or list
) and calling that global (which includes a frame save and restore).
Test the actual in
operator alone, not with the call:
>>> import timeit
>>> timeit.Timer("'A' in l", "l = list('AEIOU')").timeit(999999)
0.0490870475769043
>>> timeit.Timer("'A' in s", "s = set('AEIOU')").timeit(999999)
0.0507349967956543
>>> timeit.Timer("'A' in s", "s = 'AEIOU'").timeit(999999)
0.04982495307922363
Now the times are too close to call, because your objects are way too small. Use much larger inputs to actually see a difference.
Just using more characters already increases the contrast somewhat:
>>> timeit.Timer("'A' in l", "from string import ascii_letters, digits, whitespace; l = list(ascii_letters + digits + whitespace)").timeit(999999)
0.38520193099975586
>>> timeit.Timer("'A' in s", "from string import ascii_letters, digits, whitespace; s = set(ascii_letters + digits + whitespace)").timeit(999999)
0.046868085861206055
>>> timeit.Timer("'A' in s", "from string import ascii_letters, digits, whitespace; s = ascii_letters + digits + whitespace").timeit(999999)
0.06492304801940918
Now set()
objects are starting to win, and list objects show their slower speed.
You should also test for the negative case, where the character is not present in the object:
>>> timeit.Timer("'\xff' in l", "from string import ascii_letters, digits, whitespace; l = list(ascii_letters + digits + whitespace)").timeit(999999)
0.909243106842041
>>> timeit.Timer("'\xff' in s", "from string import ascii_letters, digits, whitespace; s = set(ascii_letters + digits + whitespace)").timeit(999999)
0.05034899711608887
>>> timeit.Timer("'\xff' in s", "from string import ascii_letters, digits, whitespace; s = ascii_letters + digits + whitespace").timeit(999999)
0.09932804107666016
Here the set()
object really excels.
Note that the string search still beats the list search in both cases; that's because iterating over a C array and comparing bytes is faster than comparing objects in a dynamic array, where you cannot up-front know that all the objects are the same type.
Upvotes: 4
Reputation: 117886
Your test is biased, you need to use the setup
argument to neglect the time to create the list
or set
>>> timeit.timeit("'A' in l", "l = list('AEIOU')")
0.06389708405846951
>>> timeit.timeit("'A' in s", "s = set('AEIOU')")
0.05960524183085081
>>> timeit.timeit("'A' in s", "s = 'AEIOU'")
0.05756433387793081
That being said, the in
operation for str
and list
is linear O(N)
and set
is constant O(1)
(neglecting lots of hash collisions). So for a much larger example, the set
would be the fastest, and str
and list
would be very similar.
Upvotes: 6