Reputation: 5068
I am trying to fill a NumPy array with data. However, the higher the index, the longer each step takes. Why?
And how can I prevent that? I have already created the arrays with their final dimensions...
import random
import numpy as np
# p = [ ... 2200 values in a python list ... ]
iterations = 1000
max_draws = len(p)-1
percentiles = np.zeros(max_draws)
money_list = np.zeros(iterations)
invest = 100
for k in range(1,max_draws):
    print(k)
    for j in range(0,iterations):
        money_list[j] = (invest * np.random.choice(p, k)).sum()
    percentiles[k] = np.percentile(money_list, 5)
I have a list of factors p that represent gains from trades on the stock market. Now I want to find out how many of those trades I must make (drawn from the list of possible trades) so that, with 95% probability, I make money rather than lose it (given that I make money if I make all the trades).
Upvotes: 1
Views: 119
Reputation: 16737
On top of the improvements already suggested, one more very effective improvement is possible.
If you don't mind installing and using the fairly heavy extra pip package numba (via python -m pip install numba), you can improve the speed considerably, as in the code below.
Numba is a just-in-time compiler that translates Python functions into efficient machine code using LLVM, and it is designed to work with NumPy. It compiles Python loops to native code instead of running them in the interpreter.
The code below achieves a speedup of 4.18x for all 2199 iterations of the outer loop, as in your code, and a speedup of roughly 100x for only a few (5-20) iterations. All 2199 iterations for your case took about 90 seconds with Numba on my slow PC.
# Needs: python -m pip install numpy numba
import random, numpy as np, numba, timeit
p = np.random.random((2200,)) # or do p = np.array(p) if p is a list
iterations = 1000
max_draws = len(p) - 1
invest = 100
def do_regular(hi):
    percentiles = np.zeros(max_draws)
    money_list = np.zeros(iterations)
    for k in range(1, hi):
        for j in range(0,iterations):
            money_list[j] = (invest * np.random.choice(p, k)).sum()
        percentiles[k] = np.percentile(money_list, 5)
    return percentiles, money_list
do_numba = numba.jit(nopython = True)(do_regular)
do_numba(2) # Pre-compile, heat up
for hi in [8, 16, 32, 64, 128, 256, 512, max_draws]: #max_draws
    tr = timeit.timeit(lambda: do_regular(hi), number = 1)
    tn = timeit.timeit(lambda: do_numba(hi), number = 1)
    print(str(hi).rjust(4), 'regular', round(tr, 3), 'sec')
    print(str(hi).rjust(4), 'numba', round(tn, 3), 'sec, speedup', round(tr / tn, 2), flush = True)
Output:
   8 regular 0.604 sec
   8 numba 0.005 sec, speedup 131.2
  16 regular 1.296 sec
  16 numba 0.013 sec, speedup 101.36
  32 regular 2.672 sec
  32 numba 0.034 sec, speedup 78.18
  64 regular 5.515 sec
  64 numba 0.113 sec, speedup 48.87
 128 regular 11.3 sec
 128 numba 0.374 sec, speedup 30.19
 256 regular 23.758 sec
 256 numba 1.35 sec, speedup 17.59
 512 regular 51.767 sec
 512 numba 5.086 sec, speedup 10.18
2199 regular 376.327 sec
2199 numba 90.104 sec, speedup 4.18
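The growing run time per step, and the shrinking speedup in the timings above, are consistent with how much sampling each step does: step k draws k values for each of the 1000 runs, so the total number of draws grows roughly quadratically with the upper bound. As a rough sketch (not part of the benchmark above and not timed here), the snippet below counts those draws and shows one way the inner Python loop could be replaced by a single NumPy call per k. The helper names total_draws and percentiles_vectorized are made up for illustration, and p is random placeholder data.

```python
import numpy as np

def total_draws(hi, iterations):
    """Count the values sampled by the two loops for k = 1 .. hi-1."""
    # Step k draws k values, `iterations` times:
    # iterations * (1 + 2 + ... + (hi - 1)).
    return iterations * (hi - 1) * hi // 2

def percentiles_vectorized(p, hi, iterations, invest=100, seed=0):
    """Hypothetical variant: one (iterations, k) draw per k, no inner loop."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    percentiles = np.zeros(hi)  # index 0 stays unused, as in the original
    for k in range(1, hi):
        # Each row is one simulated run of k trades drawn with replacement.
        draws = rng.choice(p, size=(iterations, k))
        money = invest * draws.sum(axis=1)
        percentiles[k] = np.percentile(money, 5)
    return percentiles

# Placeholder data standing in for the 2200 factors from the question.
p = np.random.default_rng(1).random(2200)
print(total_draws(2199, 1000))             # draws needed for the full run
print(percentiles_vectorized(p, 16, 1000)[1:4])
```

For the full run this comes to about 2.4 billion draws, so a large share of the time is spent generating random numbers regardless of how the loops are compiled. The vectorized variant does the same amount of sampling and only removes the per-iteration Python overhead.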
Upvotes: 1