josibake
josibake

Reputation: 63

Out of curiosity: Which is better for creating a sum? sum(list) vs accumulation

I am calculating the total distance of a route (for a traveling salesman problem) and I am curious about which is better: summing a list of integers, or using total += value (not sure what the technical term for this is.. concatenation I believe?). In other words:

totalDistance = [distance(location, location+1) for location in route]
return sum(totalDistance)

or

totalDistance = 0
for location in route:
    totalDistance += distance(location, location+1)
return totalDistance

distance() returns an int value, and the number of locations varies between about 0 to 100 for different routes. Thoughts on either method (or a completely different way) appreciated!

Edit:

Accumulation , not Concatenation.

Upvotes: 0

Views: 94

Answers (3)

Yankai Zhang
Yankai Zhang

Reputation: 171

On my machine, iterate through a generator is slower than a list:

from timeit import timeit

# just for simulation
def distance(la, lb):
    return abs(la-lb)

# for simulation, too    
def route():
    return xrange(10000)

# the longer(code) and faster but wasteful(memory) way
def longer_faster_wasteful():
    return sum([distance(location, location+1) for location in route()])

# the shorter(code) and saving(memory) but slower way
def shorter_slower_saving():
    return sum(distance(location, location+1) for location in route())

# print 2.01163072818 on my machine
print timeit(longer_faster_wasteful, number=100)

# print 2.91834802689 on my machine
print timeit(shorter_slower_saving, number=100)

Upvotes: 0

user2555451
user2555451

Reputation:

Why not use a generator expression with sum:

return sum(distance(location, location+1) for location in route)

This solution avoids creating an unnecessary list like the first solution (saves on memory consumption) and is also a lot more concise than the second (cleanliness counts).

That said, you could always merge the first solution into a one-liner:

return sum([distance(location, location+1) for location in route])

But then, as I said above, why create a list just to throw it away?

Upvotes: 2

John La Rooy
John La Rooy

Reputation: 304355

Best is to just use a generator expression with sum

return sum(distance(location, location+1) for location in route)

This saves the overhead of creating a list. totalDistance = [distance(location, location+1) for location in route]

The for loop version also doesn't create a list. It's fine, just a little verbose compared to using sum. sum exists precisely for cases like this

How does location+1 work? Seems like it should be the next item from route


For interest I compared @iCodez examples in PyPy 2.2.1. First run of each function is to allow the JIT to compile the function

>>>> from timeit import timeit
>>>> def f():
....     number = 0
....     for i in range(100):
....         number += 1
.... 
>>>> timeit(f)
0.3245859146118164
>>>> timeit(f)
0.2913198471069336

>>>> def g():
....     lst = [i for i in range(100)]
....     sum(lst)
.... 
>>>> timeit(g)
0.8840188980102539
>>>> timeit(g)
0.8698201179504395

>>>> def h():
....     sum(i for i in range(100))
.... 
>>>> timeit(h)
2.8281970024108887
>>>> timeit(h)
2.8702847957611084

Wow..genexp performance is much worse

Upvotes: 1

Related Questions