Reputation: 2872
I wrote a simple function that receives something that implements .index() and a list of characters to check for.
My assumption was that since strings and tuples are both immutable they would have similar performance (or at the least, tuple would outperform list). Instead, tuple seems equivalent to list. Why is that?
from string import ascii_lowercase
from random import randint
from timeit import Timer
def check_index(st, guesses):
for i in guesses:
st.index(i)
if __name__ == "__main__":
num_checks = 10
lst = [n for n in ascii_lowercase]
st = ascii_lowercase
tup = tuple(lst)
guesses = [ascii_lowercase[randint(0, len(ascii_lowercase)-1)]
for n in xrange(num_checks)]
def run_string():
check_index(st, guesses)
def run_list():
check_index(lst, guesses)
def run_tuple():
check_index(tup, guesses)
t2 = Timer(run_list)
print "List", t2.timeit()
t3 = Timer(run_tuple)
print "Tuple", t3.timeit()
t = Timer(run_string)
print "String", t.timeit()
Sample runs (Python 2.7.6)
List 5.26431703568
Tuple 5.28769207001
String 3.16058015823
List 5.30263400078
Tuple 5.17412590981
String 3.17718791962
List 5.21962976456
Tuple 5.35261583328
String 3.22652792931
Upvotes: 6
Views: 819
Reputation: 25954
tl;dr: String index
calls have a lot of optimizations under the hood, and don't have to do "rich comparisons". Both list
s and tuple
s do have to do them; mutability doesn't matter.
If you open up stringobject.c
, you can see that index
calls string_find_internal
, which calls stringlib_find_slice
, which calls stringlib_find
, which calls fastsearch
.
fastsearch
links to this page in its comments, check it out if you're interested. It has some interesting bits on how Boyer-Moore substring searching is implemented in python. But that's not important in your 1-character substring search, which fastsearch
explicitly special-cases:
} else if (mode == FAST_SEARCH) {
for (i = 0; i < n; i++)
if (s[i] == p[0])
return i;
}
Very tight loop, and tight loops are pretty dang fast in C.
Okay, that covers strings. What about tuple
s and list
s?
It's far simpler this time; this is the loop that both types have to do:
for (i = start; i < stop && i < Py_SIZE(self); i++) {
int cmp = PyObject_RichCompareBool(self->ob_item[i], v, Py_EQ);
if (cmp > 0)
return PyInt_FromSsize_t(i);
Instead of just checking equality with ==
, this loop has to do PyObject_RichCompareBool
every iteration through the loop. I won't get into the code in too many specifics (check out object.c
if you're curious), but essentially this has to do some type checking, then see if the types implement rich comparisons, then call that comparison function, check if it returned the NotImplemented
singleton, and finally return the comparison.
So, that loop is the bulk of the work that's done in an index
call, and both lists and tuples have to do it. Strings get to cheat.
Upvotes: 4
Reputation: 42758
index
compares every element with the given object. Since tuples stores like lists python objects, the comparision takes the same time. Strings are faster, because comparing characters is faster than comparing objects.
Upvotes: 1