Reputation: 3396
This question and my answer got me thinking about this peculiar difference between Python 2.7 and Python 3.4. Take the simple example code:
import timeit
import dis
c = 1000000
r = range(c)
def slow():
for pos in range(c):
r[pos:pos+3]
dis.dis(slow)
time = timeit.Timer(lambda: slow()).timeit(number=1)
print('%3.3f' % time)
In Python 2.7, I consistently get 0.165~
and for Python 3.4 I consistently get 0.554~
. The only significant difference between the disassemblies is that Python 2.7 emits the SLICE+3
byte code while Python 3.4 emits BUILD_SLICE
followed by BINARY_SUBSCR
. Note that I've eliminated the candidates for potential slowdown from the other question, namely strings and the fact that xrange
doesn't exist in Python 3.4 (which is supposed to be similar to the latter's range
class anyways).
Using itertools'
islice
yields nearly identical timings between the two, so I highly suspect that it's the slicing that's the cause of the difference here.
Why is this happening and is there a link to an authoritative source documenting change in behavior?
EDIT: In response to the answer, I have wrapped the range
objects in list
, which did give a noticeable speedup. However as I increased the number of iterations in timeit
I noticed that the timing differences became larger and larger. As a sanity check, I replaced the slicing with None
to see what would happen.
500 iterations in timeit
.
c = 1000000
r = list(range(c))
def slow():
for pos in r:
None
yields 10.688
and 9.915
respectively. Replacing the for loop with for pos in islice(r, 0, c, 3)
yields 7.626
and 6.270
respectively. Replacing None
with r[pos]
yielded 20~
and 28~
respectively. r[pos:pos+3]
yields 67.531
and 106.784
respectively.
As you can see, the timing differences are huge. Again, I'm still convinced the issue is not directly related to range
.
Upvotes: 11
Views: 1562
Reputation: 281958
On Python 2.7, you're iterating over a list and slicing a list. On Python 3.4, you're iterating over a range
and slicing a range
.
When I run a test with a list on both Python versions:
from __future__ import print_function
import timeit
print(timeit.timeit('x[5:8]', setup='x = list(range(10))'))
I get 0.243554830551 seconds on Python 2.7 and 0.29082867689430714 seconds on Python 3.4, a much smaller difference.
The performance difference you see after eliminating the range
object is much smaller. It comes primarily from two factors: addition is a bit slower on Python 3, and Python 3 needs to go through __getitem__
with a slice object for slicing, while Python 2 has __getslice__
.
I wasn't able to replicate the timing difference you saw with r[pos]
; you may have had some confounding factor in that test.
Upvotes: 10