Reputation: 1201
I am trying to perform addition efficiently in Python over large loops. I am looping over a range of 100000000.
from datetime import datetime
start_time = datetime.now()
sum = 0
for i in range(100000000):
    sum += i
end_time = datetime.now()
print('--- %s seconds ---{}'.format(end_time - start_time))
print(sum)
The output from the above code is:
--- %s seconds ---0:00:16.662666
4999999950000000
When I do the same thing in C, it takes 0.43 seconds.
From what I have read, Python allocates new memory every time you perform addition on a variable. I read some articles and learned how to handle string concatenation in these situations by avoiding the '+' operator, but I can't find anything about how to do the same with integers.
Upvotes: 5
Views: 315
Reputation: 35098
Here is a comparison of three methods: your original way, using sum(range(100000000)) as suggested by Alex Metsai, and using the NumPy numerical library's sum and arange functions:
from datetime import datetime
import numpy as np

def orig():
    start_time = datetime.now()
    sum = 0
    for i in range(100000000):
        sum += i
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(sum)

def pyway():
    start_time = datetime.now()
    mysum = sum(range(100000000))
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(mysum)

def npway():
    start_time = datetime.now()
    sum = np.sum(np.arange(100000000))
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(sum)
On my computer, I get:
>>> orig()
--- %s seconds ---0:00:09.504018
4999999950000000
>>> pyway()
--- %s seconds ---0:00:02.382020
4999999950000000
>>> npway()
--- %s seconds ---0:00:00.683411
4999999950000000
NumPy is the fastest of the three, if you can use it in your application.
But, as suggested by Ethan in a comment, it's worth pointing out that calculating the answer directly is by far the fastest:
def mathway():
    start_time = datetime.now()
    mysum = 99999999*(99999999+1)/2
    end_time = datetime.now()
    print('--- %s seconds ---{}'.format(end_time - start_time))
    print(mysum)
>>> mathway()
--- %s seconds ---0:00:00.000013
4999999950000000.0
I assume your actual problem is not so easily solved by pencil and paper :)
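As a small aside on the direct calculation: the sum 0 + 1 + ... + (n - 1) has the closed form n*(n-1)/2. Here is a minimal sketch of it using floor division so the result stays an exact Python int rather than a float. The function name sum_below is just an illustrative choice, not something from the question.

```python
def sum_below(n):
    """Return 0 + 1 + ... + (n - 1) using the closed form n*(n-1)/2."""
    # One of n and n - 1 is always even, so the floor division is exact.
    return n * (n - 1) // 2


print(sum_below(100000000))  # 4999999950000000
```

Because Python ints have arbitrary precision, this stays exact for much larger n as well, whereas the float version loses precision once the result exceeds 2**53.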
Upvotes: 5
Reputation: 1950
Consider using the built-in sum() function if you can process the sequence as a whole. It loops entirely in C code, which is much faster, and it avoids much of the overhead of creating new Python objects.
sum(range(100000000))
On my computer, your code takes 7.189210 seconds, while the above statement takes 2.751251 seconds, which makes it roughly 2.6 times faster.
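If you want to reproduce this kind of comparison yourself, here is a rough sketch using the standard library's timeit module instead of datetime differences. The helper names loop_sum and builtin_sum and the smaller N are my own choices for illustration; the absolute numbers will vary by machine and Python version.

```python
import timeit

# Assumption: a smaller N than the question's 100000000 so the sketch runs quickly.
N = 1_000_000


def loop_sum(n):
    """Add the numbers with an explicit Python-level loop."""
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    """Let the built-in sum() do the looping."""
    return sum(range(n))


# Take the best of three runs for each approach to reduce timing noise.
loop_seconds = min(timeit.repeat(lambda: loop_sum(N), number=1, repeat=3))
builtin_seconds = min(timeit.repeat(lambda: builtin_sum(N), number=1, repeat=3))

print(f"loop: {loop_seconds:.4f} s, sum(range): {builtin_seconds:.4f} s")
```

Taking the minimum of several repeats is a common way to filter out interference from other processes; it measures the best case, not the average.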
Edit: as suggested by mtrw, numpy.sum() can speed up processing even more.
Upvotes: 6