Optimizing generating a list of sums in Python

Question

I am attempting to use Python for the following task: given a set of integers S, produce S + S, the set of integers expressible as s1 + s2 for s1, s2 members of S (not necessarily distinct).

I am using the following code:

def sumList(l):
    # generates a list of numbers which are sums of two elements of l
    sumL = []
    howlong = len(l)
    for i in range(howlong):
        for j in range(i+1):
            if not l[i]+l[j] in sumL:
                sumL.append(l[i]+l[j])
    return sumL

This works fine for short enough lists, but when handed a longer list (say, 5000 elements between 0 and 20000) goes incredibly slowly (20+ minutes).

Question: what is making this slow? My guess is that asking whether the sum is already a member of the list is taking some time, but I am a relative newcomer to both Python and programming, so I am not sure. I am also looking for suggestions on how to perform the task of producing S + S in a quick fashion.

Patrick Haugh · Accepted Answer

Python has a built-in type set that has very fast lookups. You can't store duplicates or unhashable objects in a set, but since you want a set of integers, it's perfect for your needs. In the below, I also use itertools.product to generate the pairs.

from itertools import product

def sums(l):
    return {x+y for x, y in product(l, repeat=2)}

print(sums([1, 2, 3, 4]))
# {2, 3, 4, 5, 6, 7, 8}

As to why your existing solution is so slow, you might want to look up the term "algorithmic complexity". Basically, it's a way of categorizing algorithms into general groups based on how well they scale to many inputs. Your algorithm is a O(n^3) algorithm (it will do about n^3 comparisons). In comparison, the set solution is O(n^2). It accomplished this by discarding the need to check if a particular sum is already in the set.

Optimizing generating a list of sums in Python

Answers (1)

Related Questions