CamHart

Reputation: 4335

Is Python's for-each loop faster than an indexed for loop?

In Python, which is faster?

1

for word in listOfWords:
    doSomethingToWord(word)

2

for i in range(len(listOfWords)):
    doSomethingToWord(listOfWords[i])

Of course, I'd use xrange in Python 2.x.

My assumption is that 1 is faster than 2. If so, why?

Upvotes: 0

Views: 682

Answers (4)

PaulMcG

Reputation: 63729

In addition to the speed advantage, form 1 is cleaner-looking, and it also works for iterables that do not support len, such as generator expressions and the results of generator functions. To use form 2 with a generator, you would first have to convert it to a list to get its length, if that is even possible. But what if the generator yields all prime numbers, and doSomething is looking for the first value > 100?

for num in prime_number_generator():
    if num > 100:
        return num

There is no way to convert this to the second form, since this generator has no end.
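
For illustration, here is a minimal sketch of such an endless generator. The body of prime_number_generator is an assumption (naive trial division, not an efficient sieve), and first_prime_over is a hypothetical helper name; only the early-exit pattern comes from the snippet above.

```python
from itertools import count

def prime_number_generator():
    """Yield primes forever using naive trial division (illustrative only)."""
    for n in count(2):
        # n is prime if no d in [2, sqrt(n)] divides it evenly
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n

def first_prime_over(limit):
    """Return the first prime greater than limit, abandoning the generator early."""
    for num in prime_number_generator():
        if num > limit:
            return num

print(first_prime_over(100))  # prints 101
```

Calling list(prime_number_generator()) would never finish, so range(len(...)) is not an option here.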

Also, what if it is very expensive to create the elements of the list (as in fetching from a database or a remote web server)? If you are looking for a matching value in a generated set of N values, with #1 you can exit as soon as you find a match and avoid, on average, generating N/2 values. With #2, you first have to generate all N values just to get the length needed to build the range.
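
A rough sketch of that cost difference, assuming a hypothetical expensive_values generator that stands in for the database or web fetch. The generated list only exists to count how many values each form forces the generator to produce.

```python
generated = []  # records every value the generator actually produced

def expensive_values(n):
    """Hypothetical stand-in for a costly source such as a database query."""
    for value in range(n):
        generated.append(value)  # pretend this step is slow
        yield value

def find_foreach(n, target):
    """Form 1: iterate directly and stop at the first match."""
    for value in expensive_values(n):
        if value == target:
            return value
    return None

def find_indexed(n, target):
    """Form 2: the whole list must be built first to know its length."""
    values = list(expensive_values(n))
    for i in range(len(values)):
        if values[i] == target:
            return values[i]
    return None

generated.clear()
find_foreach(1000, 10)
foreach_count = len(generated)

generated.clear()
find_indexed(1000, 10)
indexed_count = len(generated)

print(foreach_count, indexed_count)  # prints 11 1000
```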

There is a reason Python 3 changed many builtins to return iterators instead of lists: iterators are more flexible.

What is Pythonic?
"for i in range(len(seq)):"? No.
Use "for x in seq:"
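
When the index is genuinely needed alongside the value, the built-in enumerate keeps the for-each form. A small sketch with a made-up word list:

```python
words = ["alpha", "beta", "gamma"]  # example data, not from the question

# enumerate yields (index, item) pairs, so no range(len(...)) is needed
labelled = ["%d:%s" % (i, word) for i, word in enumerate(words)]
print(labelled)  # prints ['0:alpha', '1:beta', '2:gamma']
```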

Upvotes: 1

Vishnu Upadhyay

Reputation: 5061

Simply try timeit.

In [2]: def solve(listOfWords):
   ...:     for word in range(len(listOfWords)):
   ...:         pass
   ...:

In [3]: %timeit solve(xrange(10**5))
100 loops, best of 3: 4.34 ms per loop

In [4]: def solve(listOfWords):
   ...:     for word in listOfWords:
   ...:         pass
   ...:

In [5]: %timeit solve(xrange(10**5))
1000 loops, best of 3: 1.84 ms per loop

Upvotes: 2

Salvador Dali

Reputation: 222541

Instead of asking this question, you can always try it yourself. It is not hard; a simple benchmark will show you the difference.

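from datetime import datetime

# Python 2 code: build a large list so the difference is measurable
arr = [4 for _ in xrange(10**8)]

# Time direct iteration over the elements
startTime = datetime.now()
for i in arr:
    i
print datetime.now() - startTime

# Time iteration by index
startTime = datetime.now()
for i in xrange(len(arr)):
    arr[i]
print datetime.now() - startTime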

On my machine it is:

0:00:04.822513
0:00:05.676396

Note that the list you are iterating over has to be pretty big to see the difference. The second loop takes longer because each iteration has to do a lookup by index (arr[i]) and also has to generate the next value from xrange.

Please do not spend too much time on mostly useless micro-optimization; rather, check whether you can improve the computational complexity of the functions in your inner loop.

Upvotes: 3

Duncan

Reputation: 95652

Use Python's timeit module to answer this kind of question:

duncan@ubuntu:~$ python -m timeit -s "listOfWords=['hello']*1000" "for word in listOfWords: len(word)"
10000 loops, best of 3: 37.2 usec per loop
duncan@ubuntu:~$ python -m timeit -s "listOfWords=['hello']*1000" "for i in range(len(listOfWords)): len(listOfWords[i])"
10000 loops, best of 3: 52.1 usec per loop
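
The same comparison can also be run from inside a script. This is a sketch using timeit.repeat with the statements from the commands above; the repeat and number values are arbitrary choices, and the timings will differ by machine and Python version.

```python
import timeit

setup = "listOfWords = ['hello'] * 1000"

foreach_stmt = "for word in listOfWords: len(word)"
indexed_stmt = "for i in range(len(listOfWords)): len(listOfWords[i])"

# repeat() returns one total time per run; the minimum is the least noisy
foreach_best = min(timeit.repeat(foreach_stmt, setup=setup, repeat=3, number=1000))
indexed_best = min(timeit.repeat(indexed_stmt, setup=setup, repeat=3, number=1000))

# Convert total seconds for 1000 executions into microseconds per loop
print("for-each: %.1f usec per loop" % (foreach_best / 1000 * 1e6))
print("indexed:  %.1f usec per loop" % (indexed_best / 1000 * 1e6))
```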

Upvotes: 5
