Pramod Kumar

Reputation: 1201

How to efficiently perform addition over large loops in Python

I am trying to perform addition efficiently in Python over large loops, in this case over a range of 100000000.

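from datetime import datetime

start_time = datetime.now()
total = 0  # named "total" so the built-in sum() is not shadowed
for i in range(100000000):
    total += i
end_time = datetime.now()
# The original format string mixed %s and {} placeholders; only {} is filled by format()
print('--- elapsed: {} ---'.format(end_time - start_time))
print(total)

The output from the above code is:

--- elapsed: 0:00:16.662666 ---
4999999950000000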

When I do the same thing in C, it takes 0.43 seconds.

From what I have read, Python allocates a new object every time you add to a variable. I read some articles on how to handle string concatenation in these situations by avoiding the '+' operator, but I can't find anything on how to do the same with integers.
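
To illustrate the behaviour described above, here is a minimal sketch (the helper name rebinding_demo is hypothetical). Python integers are immutable, so += builds a new int object and rebinds the name to it rather than modifying the old value in place:

```python
def rebinding_demo():
    total = 10**20       # large value, so small-int caching is not involved
    alias = total        # second name bound to the same int object
    total += 1           # creates a new int and rebinds "total" to it
    # "alias" still refers to the old, unchanged object
    return alias, total, alias is total

print(rebinding_demo())
```

There is no join-style trick for integers as there is for strings; any speedup has to come from moving the loop itself out of the interpreter.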

Upvotes: 5

Views: 315

Answers (2)

mtrw

Reputation: 35098

Here is a comparison of three methods: your original loop, sum(range(100000000)) as suggested by Alex Metsai, and the NumPy numerical library's sum and arange functions:

from datetime import datetime
import numpy as np

def orig():
    # Explicit Python-level loop, as in the question
    start_time = datetime.now()
    total = 0
    for i in range(100000000):
        total += i
    end_time = datetime.now()
    print('--- elapsed: {} ---'.format(end_time - start_time))
    print(total)

def pyway():
    # Built-in sum() over the range; the loop runs in C
    start_time = datetime.now()
    mysum = sum(range(100000000))
    end_time = datetime.now()
    print('--- elapsed: {} ---'.format(end_time - start_time))
    print(mysum)

def npway():
    # Vectorized sum; dtype=np.int64 guards against overflow on
    # platforms where NumPy's default integer is 32 bits
    start_time = datetime.now()
    total = np.sum(np.arange(100000000, dtype=np.int64))
    end_time = datetime.now()
    print('--- elapsed: {} ---'.format(end_time - start_time))
    print(total)

On my computer, I get:

>>> orig()
--- elapsed: 0:00:09.504018 ---
4999999950000000
>>> pyway()
--- elapsed: 0:00:02.382020 ---
4999999950000000
>>> npway()
--- elapsed: 0:00:00.683411 ---
4999999950000000

NumPy is the fastest of the three, if you can use it in your application.

But, as suggested by Ethan in a comment, it's worth pointing out that calculating the answer directly is by far the fastest:

def mathway():
    start_time = datetime.now()
    # Closed form for 0 + 1 + ... + 99999999; // keeps the result an exact int
    mysum = 99999999 * (99999999 + 1) // 2
    end_time = datetime.now()
    print('--- elapsed: {} ---'.format(end_time - start_time))
    print(mysum)


>>> mathway()
--- elapsed: 0:00:00.000013 ---
4999999950000000

I assume your actual problem is not so easily solved by pencil and paper :)
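
For reference, the closed form generalizes to any upper bound. A minimal sketch, assuming a hypothetical helper named sum_below; it uses integer division so the result stays an exact int however large n gets:

```python
def sum_below(n):
    """Return 0 + 1 + ... + (n - 1), i.e. sum(range(n)), in constant time."""
    if n <= 0:
        return 0  # an empty range sums to 0
    # n * (n - 1) is always even, so // 2 loses nothing
    return n * (n - 1) // 2

print(sum_below(100000000))  # 4999999950000000
```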

Upvotes: 5

Alex Metsai

Reputation: 1950

If you can process the sequence as a whole, consider using the built-in sum() function. Its loop runs entirely in C code, which is much faster, and in CPython it also avoids creating a new Python object for the running total on every iteration.

sum(range(100000000))

On my computer, your code takes 7.189210 seconds, while the statement above takes 2.751251 seconds, a speedup of about 2.6 times.
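
If you want to reproduce this kind of comparison yourself, the standard library's timeit module gives steadier numbers than a single datetime.now() pair. This is only a sketch: N, loop_sum and builtin_sum are hypothetical names, and N is far smaller than the question's 100000000 so that it finishes quickly.

```python
import timeit

N = 1_000_000  # hypothetical size, chosen only to keep the run short

def loop_sum(n):
    # Explicit Python-level loop, as in the question
    total = 0
    for i in range(n):
        total += i
    return total

def builtin_sum(n):
    # Same result, with the loop running inside sum()
    return sum(range(n))

if __name__ == "__main__":
    for func in (loop_sum, builtin_sum):
        # Best of 3 runs is less sensitive to background noise than one run
        best = min(timeit.repeat(lambda: func(N), number=1, repeat=3))
        print("{}: {:.4f} s".format(func.__name__, best))
```

The absolute times depend on the machine and Python version, so only the ratio between the two lines is meaningful.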

Edit: as suggested by mtrw, numpy.sum() can speed up processing even more.

Upvotes: 6
