Reputation: 35
I know the fastest way to sum a list of numbers is to use the built-in function `sum`. I also thought that using a `for` loop would be a slower way to do the sum than using `reduce`. However, when I try it, that turns out not to be true. Can someone explain this result?
import time, random, operator

sample = [random.randrange(10000) for _ in range(1000000)]

def use_for(l):
    acc = 0
    for n in l:
        acc += n
    print acc

def use_lambda(l):
    print reduce(operator.add, l)

print time.time()
use_for(sample)
print time.time()
use_lambda(sample)
print time.time()
The times I get:

1479671513.04
4998734199
1479671513.07
4998734199
1479671513.13

So the for loop takes about 0.03 s and reduce takes about 0.06 s, the opposite of what I expected.
Upvotes: 1
Views: 1479
Reputation: 140445
Let me show you how to do this more systematically. First, you should use the `timeit` module for benchmarking. It's a little awkward to use correctly, but it is significantly more accurate. Second, make absolutely certain you are not doing any work other than the work you care about benchmarking within the test. In particular, you should not print anything in the functions under test, because printing things is expensive. Third, you should test each candidate function over a range of lengths and then graph the results. Fourth, you don't need to go up to a million numbers to get useful results.
import csv
import operator
import random
import sys
from functools import partial, reduce
from timeit import timeit

lengths = [10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000, 20000, 50000]
samples = [ [random.randrange(10000) for i in range(n)]
            for n in lengths ]

def use_for(l):
    acc = 0
    for n in l: acc += n
    return acc

def use_reduce(l):
    return reduce(operator.add, l)

def use_sum(l):
    return sum(l)

def main():
    with sys.stdout as ofp:
        wr = csv.writer(ofp, lineterminator='\n', quoting=csv.QUOTE_MINIMAL)
        wr.writerow(('len','for loop','reduce','sum'))
        for length, sample in zip(lengths, samples):
            t_for = timeit(partial(use_for, sample), number=1000)
            t_red = timeit(partial(use_reduce, sample), number=1000)
            t_sum = timeit(partial(use_sum, sample), number=1000)
            wr.writerow((length, t_for, t_red, t_sum))

main()
We run this test program, and then we plot the output. You didn't say whether you were using Python 2 or 3, so I wrote the above to work with either, and I tested it both ways. [EDIT: And since another answer mentioned it, I've now tested PyPy as well.] Don't worry about the details of what I'm doing to make the plots: `ggplot` is well worth learning, but it, and the R language it's embedded in, can be pretty darn cryptic.
$ python2 sumbench.py > sumbench-2.csv
$ python3 sumbench.py > sumbench-3.csv
$ pypy sumbench.py > sumbench-P.csv
$ R --quiet
> suppressPackageStartupMessages({ library(reshape2); library(ggplot2); })
> data2 <- melt(read.csv('sumbench-2.csv'), id.var='len')
> data3 <- melt(read.csv('sumbench-3.csv'), id.var='len')
> dataP <- melt(read.csv('sumbench-P.csv'), id.var='len')
> data2$interp <- ordered('CPython 2', levels=c('CPython 2','CPython 3','PyPy'))
> data3$interp <- ordered('CPython 3', levels=c('CPython 2','CPython 3','PyPy'))
> dataP$interp <- ordered('PyPy', levels=c('CPython 2','CPython 3','PyPy'))
> data <- rbind(data2, data3, dataP)
> colnames(data) <- c("Input length", "Algorithm", "Time (ms)", "Interpreter")
> ggplot(data, aes(x=`Input length`, y=`Time (ms)`,
          colour=`Algorithm`, linetype=`Algorithm`)) +
      facet_grid(.~`Interpreter`) + geom_line() +
      theme_grey(base_size=9) +
      theme(legend.position=c(0.01,0.98), legend.justification=c(0,1))
The resulting plot pretty clearly demonstrates that using `reduce` is indeed slower than the `for` loop, but `sum` is much faster than either. It also clearly demonstrates that CPython 3.5 is slower at this than 2.7, which is sad but expected. PyPy is not only a solid 5x faster than either of them, but all three algorithms perform equally well! That's what happens when you throw a genuine optimizing compiler at this sort of code. (PyPy is faster than CPython's `sum()` intrinsic because it can figure out that all the elements of the list are numeric and slice out a bunch of per-element overhead. The `sum` method of a NumPy array would probably be as fast or faster than PyPy.)
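If you want to check that last claim yourself, here's a minimal sketch (my own illustration, not part of the benchmark script above; it assumes you have NumPy installed, and the variable names are just illustrative). The important detail is that the list-to-array conversion happens once, outside the timed calls:

# Rough comparison: built-in sum() on a list vs. the sum() method of a
# NumPy integer array built from the same data.
import random
from timeit import timeit

import numpy as np

sample = [random.randrange(10000) for _ in range(1000000)]
arr = np.array(sample)          # convert once, outside the timed region

print("built-in sum:", timeit(lambda: sum(sample), number=10))
print("numpy sum:   ", timeit(lambda: arr.sum(), number=10))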
It's often good to plot data like this on a log-log scale - this is also why I picked the lengths I did:
> last_plot() + scale_x_log10() + scale_y_log10()
See how they've all got roughly the same slope now? That means the asymptotic complexity of all three techniques is the same, O(n), just different constant factors. Asymptotic complexity is important because it lets you predict how long bigger inputs will take. In this case, we could just extend the three lines out to a million on the x-axis if we wanted to know how long they would take for your original test case. With a different big-O we'd see curves, and we'd need to extrapolate them differently.
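As a concrete illustration of that extrapolation, here's a small sketch that fits a straight line in log-log space to the `sum` column of the CSV produced by the benchmark above and predicts the time at a million elements. The filename matches the Python 3 run; the fitting code itself is purely illustrative:

# Fit log10(time) = slope * log10(len) + intercept by least squares, then
# extrapolate to n = 1,000,000. Reads the CSV written by the benchmark above.
import csv
import math

lengths, times = [], []
with open('sumbench-3.csv') as fp:
    for row in csv.DictReader(fp):
        lengths.append(float(row['len']))
        times.append(float(row['sum']))

xs = [math.log10(n) for n in lengths]
ys = [math.log10(t) for t in times]
k = len(xs)
mx, my = sum(xs) / k, sum(ys) / k
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

print("fitted slope (~1.0 for O(n)):", slope)
print("predicted total time for n=1000000, 1000 runs:",
      10 ** (slope * math.log10(1000000) + intercept), "seconds")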
We can also see that `sum()` has a bend in its curve, which was completely invisible on the linear plot; that means there might be some special-casing of short lists in the implementation. And it's also clearer that `reduce` has very nearly the same performance as the hand-written `for` loop in 2, but not in 3; `reduce` is no longer a built-in function in 3, but it's still implemented in compiled code, so I don't have an explanation for this.
And we can see that PyPy is dramatically slower at the beginning, in an unpredictable way: that's because the cost of just-in-time compilation of the functions being benchmarked has been ascribed to the early calls. I could add a "warm-up" step to the benchmark and make that go away, but it's a good thing to know about.
On the other hand, the fact that CPython 3 is significantly slower than CPython 2 is much harder to see on the log-log plot.
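For what it's worth, the warm-up step mentioned two paragraphs up could look something like this; `timeit_warm` is just an illustrative helper of my own, not part of the original script, and it reuses `timeit` and `partial` from the imports above:

# Call the function a few times before timing, so the JIT compilation cost
# isn't charged to the measured runs.
def timeit_warm(func, sample, warmup=10, number=1000):
    for _ in range(warmup):
        func(sample)
    return timeit(partial(func, sample), number=number)

# Then inside main(), for example:
#     t_for = timeit_warm(use_for, sample)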
Upvotes: 5
Reputation: 103704
I get dramatically different timings for `sum`.

You might want to use `timeit` as a better way to time small bits of code. Here is an example:
from __future__ import print_function
import operator
from functools import reduce

def f1(l):
    return sum(l)

def f2(l):
    return reduce(operator.add, l)

def f3(l):
    s = 0
    for e in l:
        s += e
    return s

if __name__ == '__main__':
    import timeit
    import random
    l = [random.randrange(10000) for _ in range(1000000)]
    for f in (f1, f2, f3):
        print(" ", f.__name__, timeit.timeit("f(l)", setup="from __main__ import f, l", number=100))
Python3 prints:
f1 0.7481771620000472
f2 6.92161642699989
f3 5.201012654999886
Python2 prints:
f1 0.554444074631
f2 4.81940102577
f3 3.65543603897
PyPy:
f1 0.108825922012
f2 0.112055063248
f3 0.105736970901
Upvotes: 2
Reputation: 630
After reading your question, the first thing that came to mind was this claim:

> Using a for loop could be a slower way to do the sum than using reduce

Is this based on some documentation or data, or is it just an assumption? I read around a little, and it looks like it is just an assumption.
Based on the Python documentation on functional programming:

> Functional design may seem like an odd constraint to work under. Why should you avoid objects and side effects? There are theoretical and practical advantages to the functional style:
>
> - Formal provability.
> - Modularity.
> - Composability.
> - Ease of debugging and testing.
Speed isn't listed as an advantage. If anything, I'd expect `reduce` to be slower because of the overhead of calling a function for every element (your empirical data supports this).
Also, `sum` is faster because it is implemented in C.
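To see that per-call overhead in isolation, here's a rough sketch of my own (not from any documentation, names are illustrative): the only difference between the two loops is an explicit function call per element.

# Compare an inline += against calling operator.add once per element.
import operator
import random
from timeit import timeit

l = [random.randrange(10000) for _ in range(100000)]

def loop_inline(l):
    acc = 0
    for n in l:
        acc += n              # plain in-place addition
    return acc

def loop_call(l):
    acc = 0
    add = operator.add
    for n in l:
        acc = add(acc, n)     # one extra function call per element
    return acc

print("inline:", timeit(lambda: loop_inline(l), number=100))
print("call:  ", timeit(lambda: loop_call(l), number=100))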
Upvotes: -1