Reputation: 336
So the question regarding the speed of for loops vs while loops has been asked many times before. The for loop is supposed to be faster.
However, when I tested it in Python 3.5.1 the results were as follows:
timeit.timeit('for i in range(10000): True', number=10000)
>>> 12.697646026868842
timeit.timeit('while i<10000: True; i+=1',setup='i=0', number=10000)
>>> 0.0032265179766799434
The while loop runs >3000 times faster than the for loop! I've also tried pre-generating a list for the for loop:
timeit.timeit('for i in lis: True',setup='lis = [x for x in range(10000)]', number=10000)
>>> 3.638794646750142
timeit.timeit('while i<10000: True; i+=1',setup='i=0', number=10000)
>>> 0.0032454974941904524
That made the for loop 3 times faster, but the difference is still three orders of magnitude.
Why does this happen?
Upvotes: 3
Views: 9058
Reputation: 160457
You are timing things incorrectly: setup is only executed once, and then the value of i is 10000 for all subsequent runs. See the documentation on timeit:
Time number executions of the main statement. This executes the setup statement once, and then returns the time it takes to execute the main statement a number of times, measured in seconds as a float.
Additionally, verify it by printing i for each repetition:
>>> timeit('print(i)\nwhile i<10000: True; i+=1',setup='i=0', number=5)
0
10000
10000
10000
10000
As a result, all subsequent runs merely perform a single comparison (which is True) and finish early.
Time correctly and see how the for loop is actually faster:
>>> timeit('i=0\nwhile i<10000: True; i+=1', number=10000)
8.416439056396484
>>> timeit('for i in range(10000): True', number=10000)
5.589155912399292
Upvotes: 3
Reputation: 1122352
You are creating 10k range() objects. These take some time to materialise. You then have to create iterator objects for those 10k objects too (for the for loop to iterate over the values). Next, the for loop uses the iterator protocol by calling the __next__ method on the resulting iterator. Those latter two steps also apply to a for loop over a list.
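To make that concrete, here is a sketch (my own illustration, not code from the answer) of what a for loop expands to under the hood: build an iterator, then call next() on it until StopIteration is raised.

```python
# Roughly what `for value in range(3): result.append(value)` does internally.
r = range(3)
it = iter(r)            # the for loop creates an iterator object first
result = []
while True:
    try:
        value = next(it)  # invokes it.__next__() on each pass
    except StopIteration:
        break             # the for loop ends when the iterator is exhausted
    result.append(value)

print(result)  # [0, 1, 2]
```

The real for loop does all of this in C, without the bytecode-level try/except shown here, which is exactly why it beats the hand-written equivalent.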
But most of all, you are cheating on the while loop test. The while loop only has to run once, because you never reset i back to 0 (thanks to Jim Fasarakis Hilliard for pointing that out). You are in effect running the while loop through a total of 19999 comparisons; the first test runs 10k comparisons, and each of the remaining 9999 tests runs just one comparison. And that comparison is fast:
>>> import timeit
>>> timeit.timeit('while i<10000: True; i+=1',setup='i=0', number=10000)
0.0008302750065922737
>>> (
... timeit.timeit('while i<10000: True; i+=1', setup='i=0', number=1) +
... timeit.timeit('10000 < 10000', number=9999)
... )
0.0008467709994874895
See how close those numbers are?
My machine is a little faster, so let's create a baseline to compare against; this is using 3.6.1 on a MacBook Pro (Retina, 15-inch, Mid 2015) running OS X 10.12.5. And let's also fix the while loop to set i = 0 in the test, not the setup (which is run just once):
>>> import timeit
>>> timeit.timeit('for i in range(10000): pass', number=10000)
1.9789885189966299
>>> timeit.timeit('i=0\nwhile i<10000: True; i+=1', number=10000)
5.172155902953818
Oops, so a correctly running while loop is actually slower here; there goes your premise (and mine!).
I used pass to avoid having to answer questions about how fast referencing the True object is (it's fast, but beside the point). My timings are going to be about 6x faster than your machine's.
If you wanted to explore why the iteration is faster, you could time the various components of the for loop in Python, starting with creating the range() object:
>>> timeit.timeit('range(10000)', number=10000)
0.0036197409499436617
So creating 10000 range() objects takes more time than running a single while loop that iterates 10k times. range() objects are more expensive to create than integers.
This does involve a global name lookup, which is slower; you could make it faster by using setup='_range = range' and then using _range(10000). This shaves off about a third of the timings.
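As a sketch of that alias trick (my own illustration; absolute numbers will vary by machine): binding range to a name via setup lets the timed statement resolve the name without falling through to the builtins namespace.

```python
import timeit

# Plain lookup: LOAD of `range` misses the module globals and falls
# through to builtins on every call.
with_builtin = timeit.timeit('range(10000)', number=10000)

# Alias trick: `_range` is bound directly in timeit's namespace by the
# setup statement, so the fallback to builtins is skipped.
with_alias = timeit.timeit('_range(10000)',
                           setup='_range = range',
                           number=10000)

print(with_builtin, with_alias)
```

On a typical run the aliased version comes out measurably faster, though by how much depends on the interpreter and machine.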
Next, create an iterator for this; here I'll use a local name for the iter() function, as the for loop doesn't have to do a hash-table lookup and just reaches for the C function instead. A hard-coded reference to a memory location in a binary is a lot faster, of course:
>>> timeit.timeit('_iter(r)', setup='_iter = iter; r = range(10000)', number=10000)
0.0009729859884828329
Fairly fast, but it takes about the same amount of time as your single while loop iterating 10k times. So creating iterable objects is cheap. The C implementation is faster still. We haven't iterated yet.
Last, we call __next__ on the iterator object, 10k times. This is again done in C code, with cached references to internal C implementations, but with a functools.partial() object we can at least attempt to get a ball-park figure:
>>> timeit.timeit('n()', setup='from functools import partial; i = iter(range(10000)); n = partial(i.__next__)', number=10000) * 10000
7.759470026940107
Boy, 10k times 10k calls to iter(range(10000)).__next__ takes almost 4x more time than the for loop managed; this goes to show how efficient the actual C implementation really is.
However, it does illustrate that looping in C code is a lot faster, and this is why the while loop is actually slower when executed correctly; summing integers and making boolean comparisons in bytecode takes more time than iterating over range() in C code (where the CPU does the incrementing and comparisons directly in CPU registers):
>>> (
... timeit.timeit('9999 + 1', number=10000 ** 2) +
... timeit.timeit('9999 < 10000', number=10000 ** 2)
... )
3.695550534990616
It is those operations that make the while loop about 3 seconds slower.
TLDR: You didn't actually test the while loop correctly. I should have noticed this earlier too.
Upvotes: 13