Is slicing really slower in Python 3.4?

This question and my answer got me thinking about this peculiar difference between Python 2.7 and Python 3.4. Take the simple example code:

import timeit
import dis

c = 1000000
r = range(c)
def slow():
    for pos in range(c):
        r[pos:pos+3]

dis.dis(slow)

time = timeit.Timer(lambda: slow()).timeit(number=1)
print('%3.3f' % time)

In Python 2.7, I consistently get 0.165~ and for Python 3.4 I consistently get 0.554~. The only significant difference between the disassemblies is that Python 2.7 emits the SLICE+3 byte code while Python 3.4 emits BUILD_SLICE followed by BINARY_SUBSCR. Note that I've eliminated the candidates for potential slowdown from the other question, namely strings and the fact that xrange doesn't exist in Python 3.4 (which is supposed to be similar to the latter's range class anyways).

Using itertools' islice yields nearly identical timings between the two, so I highly suspect that it's the slicing that's the cause of the difference here.

Why is this happening and is there a link to an authoritative source documenting change in behavior?

EDIT: In response to the answer, I have wrapped the range objects in list, which did give a noticeable speedup. However as I increased the number of iterations in timeit I noticed that the timing differences became larger and larger. As a sanity check, I replaced the slicing with None to see what would happen.

500 iterations in timeit.

c = 1000000
r = list(range(c))
def slow():
    for pos in r:
        None

yields 10.688 and 9.915 respectively. Replacing the for loop with for pos in islice(r, 0, c, 3) yields 7.626 and 6.270 respectively. Replacing None with r[pos] yielded 20~ and 28~ respectively. r[pos:pos+3] yields 67.531 and 106.784 respectively.

As you can see, the timing differences are huge. Again, I'm still convinced the issue is not directly related to range.

Upvotes: 11

Views: 1562

Answers (1)

user2357112
user2357112

Reputation: 281958

On Python 2.7, you're iterating over a list and slicing a list. On Python 3.4, you're iterating over a range and slicing a range.

When I run a test with a list on both Python versions:

from __future__ import print_function
import timeit
print(timeit.timeit('x[5:8]', setup='x = list(range(10))'))

I get 0.243554830551 seconds on Python 2.7 and 0.29082867689430714 seconds on Python 3.4, a much smaller difference.


The performance difference you see after eliminating the range object is much smaller. It comes primarily from two factors: addition is a bit slower on Python 3, and Python 3 needs to go through __getitem__ with a slice object for slicing, while Python 2 has __getslice__.

I wasn't able to replicate the timing difference you saw with r[pos]; you may have had some confounding factor in that test.

Upvotes: 10

Related Questions