Reputation:
I wrote a test program to see how well PyPy can optimize Python code and make it run faster. It is a non-in-place quicksort, and it is supposed to run slowly enough to make the difference visible. Simply replacing python with pypy actually made it slower, from 16 seconds to 25 seconds (a minimal way to time just the sort is sketched after the code below). I searched a bit and found the opt option, but I can't find a way to apply it to pypy. I'm quite new to Python, so please help me a bit.
import sys

def sqsort(xxs):
    if len(xxs) == 1 or len(xxs) == 0:
        return xxs
    x = xxs[0]
    xs = xxs[1:]
    l = []
    g = []
    for x2 in xs:
        if x2 < x:
            l.append(x2)
        if x2 >= x:
            g.append(x2)
    return sqsort(l) + [x] + sqsort(g)

sys.setrecursionlimit(30000)
l = list(reversed(range(15000)))
print(l)
print(sqsort(l))
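For reference, a minimal way to time just the sort (assuming the definitions above, including the setrecursionlimit call, have already run) would be something like:

import time

# Time only the sorting work, leaving out the large print calls.
start = time.perf_counter()
result = sqsort(list(reversed(range(15000))))
print("sorted %d items in %.2f seconds" % (len(result), time.perf_counter() - start))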
Upvotes: 0
Views: 526
Reputation: 12900
Not a full answer, but indeed the problem is recursion, which PyPy is not good at. Here is the same algorithm rewritten to use recursion only for the shorter of the two sublists (either l or g), and iteration for the longer one. This version is still recursive, but because each recursive call handles at most half of the remaining elements, the recursion depth is guaranteed to be O(log(n)) instead of O(n). It is now 4-5x faster in PyPy.

Note that we can't say that the overall time of this algorithm (in either version) is really O(n log(n)), because it is full of list concatenations, which take time too. You can't treat Python's lists like you would treat Haskell's or Lisp's "cons" chained lists; in Python, lists are variable-sized arrays. (An in-place sketch that avoids the concatenations is shown after the code below.)
def sqsort(xxs):
    left, right = [], []
    while True:
        if len(xxs) == 1 or len(xxs) == 0:
            return left + xxs + right
        x = xxs[0]
        xs = xxs[1:]
        l = []
        g = []
        for x2 in xs:
            if x2 < x:
                l.append(x2)
            if x2 >= x:
                g.append(x2)
        if len(l) <= len(g):
            left += sqsort(l) + [x]
            xxs = g
        else:
            right = [x] + sqsort(g) + right
            xxs = l

l = list(reversed(range(15000)))
print(l)
print(sqsort(l))
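To illustrate the remark about concatenation costs, here is a sketch (not part of the measured comparison above, and not benchmarked) of an in-place variant that partitions within a single list instead of building new ones, and, like the version above, recurses only into the smaller partition so the recursion depth stays O(log(n)). The middle-element pivot is just one reasonable choice:

def sqsort_inplace(xs, lo=0, hi=None):
    # Sorts xs[lo:hi+1] in place and returns xs.
    if hi is None:
        hi = len(xs) - 1
    while lo < hi:
        # Use the middle element as pivot; move it to the end,
        # then do a Lomuto-style partition around it.
        mid = (lo + hi) // 2
        xs[mid], xs[hi] = xs[hi], xs[mid]
        pivot = xs[hi]
        i = lo
        for j in range(lo, hi):
            if xs[j] < pivot:
                xs[i], xs[j] = xs[j], xs[i]
                i += 1
        xs[i], xs[hi] = xs[hi], xs[i]
        # Recurse into the smaller partition, iterate on the larger one,
        # so the recursion depth is bounded by O(log(n)).
        if i - lo < hi - i:
            sqsort_inplace(xs, lo, i - 1)
            lo = i + 1
        else:
            sqsort_inplace(xs, i + 1, hi)
            hi = i - 1
    return xs

It mutates its argument, so you would call it as xs = list(reversed(range(15000))); sqsort_inplace(xs), after which xs is sorted.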
Upvotes: 1