Reputation: 1
I'm trying to run this kind of loop (simplified in this example), which generates and adds up random consumptions for 1000 clients. It takes approximately 1 h 30 min.
import numpy as np
rand_array = np.random.rand(35000)
total_consumption = np.zeros(35000)
for t in range(0,1000):
    consumption = np.zeros(35000)
    consumption[0] = 0.5
    rand_array = np.random.rand(35000)
    for i in range(1,35000):
        consumption[i] = rand_array[i] * consumption[i-1]
    total_consumption = total_consumption + consumption
Is there a way I can make this faster and more efficient? I tried using a list comprehension, to no avail.
Upvotes: 0
Views: 126
Reputation: 208003
I had a try at doing the middle part with numba:
import numpy as np
from numba import jit

@jit(nopython=True)
def speedy(consumption, rand_array):
    # Start at 1 so that consumption[0] keeps its initial value
    for i in range(1, len(consumption)):
        consumption[i] = rand_array[i] * consumption[i-1]
    return consumption

total_consumption = np.zeros(35000)
for t in range(0,1000):
    consumption = np.zeros(35000)
    consumption[0] = 0.5
    rand_array = np.random.rand(35000)
    consumption = speedy(consumption, rand_array)
    total_consumption = total_consumption + consumption
The time was 259 ms versus 9.6 seconds for your code. I guess you could do more in numba too if you wanted to try.
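As a rough sketch of what "doing more in numba" might look like, the whole simulation, including the loop over clients, could be moved into a single compiled function. The helper name `total_consumption_all` and its parameters are made up for illustration, and the `try`/`except` fallback is only there so the snippet still runs (slowly) when numba is not installed. I have not timed this version.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption): without numba, run as plain Python
    def njit(func):
        return func


@njit
def total_consumption_all(n_clients, n_steps, start):
    # Hypothetical helper: accumulates all clients without temporary arrays
    total = np.zeros(n_steps)
    for _ in range(n_clients):
        previous = start
        total[0] += previous
        for i in range(1, n_steps):
            # Same recurrence as above: each value is a random fraction
            # of the previous one
            previous = np.random.random() * previous
            total[i] += previous
    return total


# Small demo sizes; the question uses 1000 clients and 35000 steps
total_consumption = total_consumption_all(10, 1000, 0.5)
```

Note that the random numbers are drawn one at a time here, so the results will not match the array-based version draw for draw, although the distribution should be the same.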
Upvotes: 2
Reputation: 50901
You can use np.cumprod to vectorize the computation and make it much faster. Here is the resulting code:
import numpy as np

total_consumption = np.zeros(35000)
for t in range(0,1000):
    rand_array = np.random.rand(35000)
    rand_array[0] = 0.5 # Initial value, so that the cumprod starts from 0.5
    consumption = np.cumprod(rand_array)
    total_consumption += consumption
This code takes 267 milliseconds on my machine, while the original one takes 11.8 seconds. Thus, it is about 44 times faster.
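If the remaining Python loop over clients matters, one possible variation is to process several clients at once with a 2D array and `np.cumprod(..., axis=1)`. This is only a sketch: `batch_size` is an assumed value chosen to keep memory use modest (about 28 MB per array here), it assumes the number of clients is divisible by the batch size, and I have not measured whether it is actually faster, since the work is likely memory-bound.

```python
import numpy as np

n_clients, n_steps = 1000, 35000
batch_size = 100  # assumed value; larger batches use more memory

rng = np.random.default_rng()
total_consumption = np.zeros(n_steps)

for _ in range(n_clients // batch_size):
    # One row per client, one column per time step
    factors = rng.random((batch_size, n_steps))
    factors[:, 0] = 0.5  # initial consumption of every client
    # Cumulative product along each row, then sum over the clients in the batch
    total_consumption += np.cumprod(factors, axis=1).sum(axis=0)
```

Each row reproduces the same recurrence as the 1D version, so the first entry of `total_consumption` is 0.5 times the number of clients.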
Upvotes: 2