Optimizing a Python one-liner

Question

I profiled my program, and more than 80% of its time is spent in this one-line function. How can I optimize it? I am running under PyPy, so I'd rather not use NumPy, but since the program spends almost all of its time here, giving up PyPy for NumPy might be worth it. That said, I would prefer to use CFFI, since it is more compatible with PyPy.

# x and y are lists of 1s and 0s. c_out is a positive int. bit is 1 or 0.
def findCarryIn(x, y, c_out, bit):
    return (2 * c_out +
            bit -
            sum(map(lambda x_bit, y_bit: x_bit & y_bit, x, reversed(y))))  # Note: this is basically a dot product.

Anand S Kumar · Accepted Answer

Without using NumPy, and after testing with timeit, the fastest method for the sum you are computing seems to be a simple for loop that accumulates over the elements. Example:

def findCarryIn(x, y, c_out, bit):
    s = 0
    for i,j in zip(x, reversed(y)):
        s += i & j
    return (2 * c_out + bit - s)

This did not improve performance by much, though (maybe 20% or so).
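Before swapping one version for the other, it may be worth checking that both return the same value. The following is a minimal sketch of such a check. The names find_carry_in_original and find_carry_in_loop, the random inputs, and the assumption that x and y have equal length are all made up for illustration and are not part of the question.

```python
import random

def find_carry_in_original(x, y, c_out, bit):
    # The one-liner from the question.
    return (2 * c_out + bit -
            sum(map(lambda x_bit, y_bit: x_bit & y_bit, x, reversed(y))))

def find_carry_in_loop(x, y, c_out, bit):
    # The explicit loop from this answer.
    s = 0
    for i, j in zip(x, reversed(y)):
        s += i & j
    return 2 * c_out + bit - s

# Compare both versions on random equal-length bit lists (assumed inputs).
random.seed(0)
for _ in range(1000):
    n = random.randint(0, 32)
    x = [random.randint(0, 1) for _ in range(n)]
    y = [random.randint(0, 1) for _ in range(n)]
    c_out = random.randint(0, 10)
    bit = random.randint(0, 1)
    assert (find_carry_in_original(x, y, c_out, bit) ==
            find_carry_in_loop(x, y, c_out, bit))
```

If the loop finishes without an AssertionError, the two versions agreed on every sampled input.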

The results of the timing tests for the different methods (func4 is the method described above):

def func1(x,y):
    return sum(map(lambda x_bit, y_bit: x_bit & y_bit, x, reversed(y)))

def func2(x,y):
    return sum([i & j for i,j in zip(x,reversed(y))])

def func3(x,y):
    return sum(x[i] & y[-1-i] for i in range(min(len(x),len(y))))

def func4(x,y):
    s = 0
    for i,j in zip(x, reversed(y)):
        s += i & j
    return s

In [125]: %timeit func1(x,y)
100000 loops, best of 3: 3.02 µs per loop

In [126]: %timeit func2(x,y)
The slowest run took 6.42 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 2.9 µs per loop

In [127]: %timeit func3(x,y)
100000 loops, best of 3: 4.31 µs per loop

In [128]: %timeit func4(x,y)
100000 loops, best of 3: 2.2 µs per loop
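The timings above come from an IPython session, and the inputs x and y are not shown. To repeat the comparison as a plain script (for example under PyPy, where the ranking may well differ because of the JIT), something along these lines could be used. This is only a sketch: the list length of 16, the random inputs, and the helper name best_time_us are assumptions, not part of the original test.

```python
import random
import timeit

def func1(x, y):
    return sum(map(lambda x_bit, y_bit: x_bit & y_bit, x, reversed(y)))

def func4(x, y):
    s = 0
    for i, j in zip(x, reversed(y)):
        s += i & j
    return s

# Assumed sample inputs: two random 16-element lists of 0s and 1s.
random.seed(0)
x = [random.randint(0, 1) for _ in range(16)]
y = [random.randint(0, 1) for _ in range(16)]

def best_time_us(func, number=20000, repeat=3):
    # Best of `repeat` runs, reported in microseconds per call.
    timer = timeit.Timer(lambda: func(x, y))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6

for f in (func1, func4):
    print("%s: %.2f us per call" % (f.__name__, best_time_us(f)))
```

The absolute numbers depend on the interpreter, the machine, and the list length, so only the relative ordering on your own setup is meaningful.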
