Reputation: 17680
What is the fastest way to find the index of an element in a list of integers?
Now I am doing
if value in mylist:
return mylist.index(value)
but it seems I am doing the same thing two times: to know if value
is in mylist
I also know the index position. I tried also other solutions:
try:
return mylist.index(value)
except ValueError:
return None
or
for i, x in enumerate(mylist):
if x == value:
return i
return None
but all these solutions seems to be slower.
The array is non-sorted with only 4 elements.
Upvotes: 3
Views: 730
Reputation: 26586
Just using if-elses is fast, but if you are always searching on the same list (or your list doesn't change too often) you can be a bit faster by storing the element -> index mapping in a dict and then doing a dictionary lookup.
So your code should look something like this:
# Precompute the mapping.
mapping = { index: value for value, index in enumerate(TEST_LIST) }
# Search function:
def lookup(value):
return mapping.get(value, None)
I ran some tests comparing this with other approaches. Here's my test code:
import timeit
TEST_LIST = [100, -2, 10007, 2**70 + 1]
mapping = { index: value for value, index in enumerate(TEST_LIST) }
NUM_TIMES = 10**6
def by_if_else(lst, value):
if lst[0] == value:
return 0
elif lst[1] == value:
return 1
elif lst[2] == value:
return 2
elif lst[3] == value:
return 3
else:
return None
def by_index(lst, value):
for i in xrange(4):
if lst[i] == value:
return i
return None
def by_exception(lst, value):
try:
lst.index(value)
except ValueError:
return None
def by_iter(lst, value):
for index, element in enumerate(lst):
if element == value:
return value
return None
def by_dict(lst, value):
return mapping.get(value, None)
def TimeFunction(function_name, value):
if 'dict' in function_name:
return timeit.timeit(
stmt = '%s(mapping, %d)' % (function_name, value),
setup = 'from __main__ import %s, mapping' % function_name,
number=NUM_TIMES)
else:
return timeit.timeit(
stmt = '%s(TEST_LIST, %d)' % (function_name, value),
setup = 'from __main__ import %s, TEST_LIST' % function_name,
number=NUM_TIMES)
def RunTestsOn(value):
print "Looking for %d in %s" % (value, str(TEST_LIST))
function_names = [name for name in globals() if name.startswith('by_')]
for function_name in function_names:
print "Function: %s\nTime: %f" % (
function_name, TimeFunction(function_name, value))
def main():
values_to_look_for = TEST_LIST + [ -10**70 - 1, 55, 29]
for value in values_to_look_for:
RunTestsOn(value)
if __name__ == '__main__':
main()
It looks like the if-else approach is faster when the values being searched for are small and are present in the list (I removed runtimes for the other functions):
Looking for 10007 in [100, -2, 10007, 1180591620717411303425L]
Function: by_dict
Time: 0.213232
Function: by_if_else
Time: 0.181917
But slower if the value is large (i.e. comparison in expensive):
Looking for 1180591620717411303425 in [100, -2, 10007, 1180591620717411303425L]
Function: by_dict
Time: 0.223594
Function: by_if_else
Time: 0.380222
Or, when the value isn't present in the list at all (even if the value is small):
Looking for 29 in [100, -2, 10007, 1180591620717411303425L]
Function: by_dict
Time: 0.195733
Function: by_if_else
Time: 0.267689
While it is obvious that using a dict should be faster due to queries on it being O(1) as opposed to O(n) for all the other approaches, for such a small list, the interpreter is probably creating optimized bytecode for the if-else version and the overhead of doing pointer chases through a hashtable offsetting a lot of the speed advantage of the dict. But it still appears to be slightly faster most of the time. I would suggest you test out this approach on your data and see which works better for you.
Upvotes: 1
Reputation: 180411
You can use a set to check for membership, it will be more efficient than checking the list but the greatest overhead is the indexing:
In [54]: l = [1,2,3,4]
In [55]: s = set([1,2,3,4])
In [56]: timeit l.index(6) if 6 in s else False
10000000 loops, best of 3: 79.9 ns per loop
In [57]: timeit l.index(6) if 6 in l else False
10000000 loops, best of 3: 141 ns per loop
In [58]: timeit l.index(4) if 4 in l else False
1000000 loops, best of 3: 381 ns per loop
In [59]: timeit l.index(4) if 4 in s else False
1000000 loops, best of 3: 333 ns per loop
Upvotes: 1
Reputation: 739
As you have only four items, you can also try this :
if value == mylist[0]:
return 0
elif value == mylist[1]:
return 1
elif value == mylist[2]:
return 2
elif value == mylist [3]:
return 3
Let me know how it works in your case. I am curious. :)
Upvotes: 16