Reputation: 12410
I have a large list with numbers n1
in listn1
. I want to add all multiples of each number, if the multiplicand is not in another list listn2
(prime numbers with specific characteristics) and the product is below a maximum max_n
. The multiplicand is in most cases below 10, but can go up to 100,000. My code so far looks like:
s = 0
for n1 in listn1:
s += sum(n1 * i for i in range(1, 1 + max_n // n1) if i not in listn2)
The problem: this approach is sloooow. It takes seconds to calculate listn1
and listn2
, so I know that there are more than a million numbers to add. But I started the summation yesterday evening and it is still running this morning.
Is there a Pythonic way to speed up this summation?
Upvotes: 1
Views: 570
Reputation: 12410
Taking the suggestions on board, I re-wrote the code to
setn2 = set(listn2)
s = 0
for n1 in listn1:
s += n1 * (max_n * (max_n + 1) // 2 - sum(i for i in range(1, 1 + max_n // n1) if i in setn2))
Instead of hours, the summation now takes seconds.
Several hours later
Coming across this old question of mine, it turned out that retrospectively, I did the right thing. The idea mentioned here in this thread to use numpy.sum()
instead was well intended but wrong, as shown here. If you work in numpy, fine, but if you have a Python list, use comprehensions.
Upvotes: 0
Reputation: 4592
I have 2 suggestions for you.
First of all, you don't have multiply i
with n1
at each iteration. You can replace
s += sum(n1 * i for i in range(1, 1 + max_n // n1) if i not in listn2)
with
s += n1 * sum(i for i in range(1, 1 + max_n // n1) if i not in listn2)
They are totally same.
Secondly, without if i not in listn2
condition, you have a simple summation:
sum(i for i in range(1, 1 + max_n // n1)
This is same with sum([1, 2, 3, 4, 5, 6, 7, 8, ..., (max_n // n1)])
, and equals (max_n // n1) * (1 + max_n // n1) / 2
. For a simply example, take a look at this.
To handle if i not in listn2
condition, if your listn2
is smaller, you can sum listn2
instead of listn1
.
So, find the sum of listn1
and subtract the items in listn2
:
def sum_until(l, max):
return sum([x for x in l if x < max])
listn2 = list(set(listn2))
for n1 in listn1:
finish = max_n // n1
s += n1 * (finish * (finish + 1) / 2 - sum_until(listn2, finish))
EDIT:
I guess NumPy would be faster for sum. Make listn2
a numpy array:
import numpy as np
listn2 = np.array(list(set(listn2)))
And use this sum_until
function:
def sum_until(listn2, max):
l = listn2[np.where(listn2 <= max)]
return int(np.sum(l))
Upvotes: 1