unamed19

Reputation: 1

How can I make this for loop faster and more efficient?

I'm trying to run this kind of loop (simplified in this example), which generates and adds up random consumptions for 1000 clients. It takes approximately 1 h 30 min.

import numpy as np

rand_array = np.random.rand(35000)
total_consumption = np.zeros(35000)

for t in range(0, 1000):
    consumption = np.zeros(35000)
    consumption[0] = 0.5
    rand_array = np.random.rand(35000)

    for i in range(1, 35000):
        consumption[i] = rand_array[i] * consumption[i-1]

    total_consumption = total_consumption + consumption

Is there a way to make this faster and more efficient? I tried using a list comprehension, to no avail.

Upvotes: 0

Views: 126

Answers (2)

Mark Setchell

Reputation: 208003

I had a try at doing the middle part with numba:

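import numpy as np
from numba import jit

@jit(nopython=True)
def speedy(consumption, rand_array):
    # Start at 1 so that consumption[0] (the 0.5 seed) is not overwritten
    # by rand_array[0] * consumption[-1], which would zero the whole series.
    for i in range(1, len(consumption)):
        consumption[i] = rand_array[i] * consumption[i-1]
    return consumption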

rand_array = np.random.rand(35000)
total_consumption = np.zeros(35000)

for t in range(0,1000):
    consumption = np.zeros(35000)
    consumption[0] = 0.5
    rand_array = np.random.rand(35000)

    consumption = speedy(consumption, rand_array)
    total_consumption = total_consumption + consumption

The time was 259 ms versus 9.6 seconds for your code. I guess you could do more in numba too if you wanted to try.

Upvotes: 2

Jérôme Richard

Reputation: 50901

You can use np.cumprod to vectorize the computation and make it much faster. Here is the resulting code:

import numpy as np

total_consumption = np.zeros(35000)

for t in range(0,1000):
    rand_array = np.random.rand(35000)
    rand_array[0] = 0.5 # Needed for the cumprod
    consumption = np.cumprod(rand_array)
    total_consumption += consumption

This code takes 267 milliseconds on my machine, while the original one takes 11.8 seconds. Thus, it is about 44 times faster.
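If you want to convince yourself that np.cumprod gives the same series as the original recurrence, and possibly remove the Python-level loop over clients too, here is a minimal sketch. The helper names (consumption_loop, total_consumption_batched) and the batch and seed parameters are assumptions of mine, not part of the question. Clients are processed in batches so that the 2D random array stays small; I have not benchmarked this variant.

```python
import numpy as np

def consumption_loop(rand_array, start=0.5):
    """Reference implementation: the question's element-by-element recurrence."""
    consumption = np.zeros(len(rand_array))
    consumption[0] = start
    for i in range(1, len(rand_array)):
        consumption[i] = rand_array[i] * consumption[i - 1]
    return consumption

def total_consumption_batched(n_clients=1000, n_steps=35000, batch=100, seed=None):
    """Sum the cumulative-product series of all clients, one batch at a time."""
    rng = np.random.default_rng(seed)
    total = np.zeros(n_steps)
    for start in range(0, n_clients, batch):
        size = min(batch, n_clients - start)   # last batch may be smaller
        rand = rng.random((size, n_steps))     # one row per client
        rand[:, 0] = 0.5                       # seed value, needed for the cumprod
        total += np.cumprod(rand, axis=1).sum(axis=0)
    return total

# Check on a short series that cumprod matches the explicit loop
r = np.random.default_rng(1).random(200)
r_seeded = r.copy()
r_seeded[0] = 0.5
assert np.allclose(consumption_loop(r), np.cumprod(r_seeded))

total_consumption = total_consumption_batched()
```

With batch=100 the temporary array holds 100 x 35000 doubles (about 28 MB); a larger batch means fewer Python iterations at the cost of more memory.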

Upvotes: 2
