Reputation: 1
I was working on Project Euler problem 12 and came up with the following solution:
import math

def main():
    num = 0
    j = 0
    numFactors = 0
    while numFactors <= 500:
        j += 1
        num = num + j  # num = sum of numbers from 1 to j
        #num = (j * (j + 1)) // 2
        numFactors = 0
        for n in range(1, int(math.sqrt(num))):
            if num % n == 0:
                numFactors += 2
    print(num)

if __name__ == '__main__':
    from timeit import Timer
    t = Timer(lambda: main())
    print(t.timeit(number=1))
My question is about the line num = num + j versus the commented-out alternative num = (j * (j + 1)) // 2.
When I use just the single addition, the program is consistently slower than when I uncomment the next line and do a multiplication, an addition and a floor division instead.
Can anyone explain why the single operation is slower than doing three?
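In case it is useful, a minimal sketch for timing just the two expressions in isolation (outside the factor-counting loop) might look like the following; the values of j and num are arbitrary stand-ins for one loop iteration:

from timeit import timeit

# Arbitrary values standing in for one iteration; num is the running
# total just before j is added.
setup_add = "j = 12375; num = j * (j - 1) // 2"
setup_formula = "j = 12375"

# Time each expression on its own, many times.
t_add = timeit("num + j", setup=setup_add, number=10**7)
t_formula = timeit("(j * (j + 1)) // 2", setup=setup_formula, number=10**7)

print(t_add, t_formula)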
Upvotes: 0
Views: 93
Reputation: 6098
I don’t actually see any difference in performance on my machine, and I would be moderately surprised if the multiplication really were faster than the addition.
We can use the dis module to disassemble the bytecode of the two variants. Most of it is identical across the two variants, so that shared portion should cost the same in either function. These are the two sections that differ (line 11 is num = num + j, line 12 is num = (j * (j + 1)) // 2):
 11          43 LOAD_FAST                0 (num)
             46 LOAD_FAST                1 (j)
             49 BINARY_ADD
             50 STORE_FAST               0 (num)

 12          43 LOAD_FAST                1 (j)
             46 LOAD_FAST                1 (j)
             49 LOAD_CONST               3 (1)
             52 BINARY_ADD
             53 BINARY_MULTIPLY
             54 LOAD_CONST               4 (2)
             57 BINARY_FLOOR_DIVIDE
             58 STORE_FAST               0 (num)
Three operations are present in both disassemblies. Here they are again, showing only the lines unique to each variant:
 11          43 LOAD_FAST                0 (num)

 12          43 LOAD_FAST                1 (j)
             49 LOAD_CONST               3 (1)
             53 BINARY_MULTIPLY
             54 LOAD_CONST               4 (2)
             57 BINARY_FLOOR_DIVIDE
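For anyone who wants to reproduce the disassembly, a minimal sketch is to wrap each line in a small function of its own and pass it to dis.dis; the function names below are my own, not from the original program:

import dis

def variant_add(num, j):
    num = num + j             # the single-addition line
    return num

def variant_formula(num, j):
    num = (j * (j + 1)) // 2  # the multiply/add/floor-divide line
    return num

dis.dis(variant_add)
dis.dis(variant_formula)

(The exact byte offsets and opcode names will vary between Python versions.)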
If the multiplication truly were faster than the addition, then I expect somebody well-versed in bytecode could explain why the single LOAD_FAST for num should be slower than the five operations unique to line 12. But to my layman’s eye, I’d expect the variant with more bytecode operations to take longer.
Upvotes: 1