Reputation: 19
I'm trying to create a Monte Carlo simulation to simulate future stock prices using Numpy arrays.
My current approach is: create a For Loop which fills an array, stock_price_array, with simulated stock prices. These stock prices are generated by taking the last stock price, then multiplying it by 1 + an annual return. The annual returns are drawn randomly from a normal distribution and stored in the array annual_ret.
My problem is that although the "stock price" variables I print from my For Loop appear to be correct, I simply cannot figure out how to Append these stock price variables to stock_price_array.
I've tried various methods, including initializing the stock_price_array using .full instead of .empty, changing the order of where the array appears in the For Loop, and checking the size of the array.
I've read other Stack Overflow posts on similar topics but can't figure out what I'm doing wrong.
Thank you in advance for your help!
import numpy as np

annual_mean = .06
annual_stdev = .15
start_stock_price = 100
numYears = 3
numSimulations = 4
stock_price_array = np.empty(numYears)
# draw an annual return from a normal distribution; this annual return will be random
annual_ret = np.random.normal(annual_mean, annual_stdev, numSimulations)
for i in range(numYears):
    stock_price = np.multiply(start_stock_price, (1 + annual_ret[i]))
    np.append(stock_price_array, [stock_price])
    start_stock_price = stock_price
Upvotes: 0
Views: 127
Reputation: 13346
The first rule of numpy is: never iterate over your array yourself. Use a numpy function that does the whole computation in one batch. Such functions do iterate over the array internally, of course, but that iteration does not happen in Python, so it is much faster.
For example, here you could do something like this:
np.cumprod(np.hstack([start_stock_price, annual_ret+1]))
What it does is first build an array made of the initial value followed by the growth factors. So if the initial value is 100 and the annual returns are 0.1, -0.1, 0.2, 0.2 (for example), then hstack builds an array of values 100, 1.1, 0.9, 1.2, 1.2.
And then cumprod just builds the cumulative product of those:
100, 100×1.1=110, 100×1.1×0.9=110×0.9=99, 100×1.1×0.9×1.2=99×1.2=118.8, 100×1.1×0.9×1.2×1.2=118.8×1.2=142.56
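The worked example above can be checked directly. This is only a minimal sketch: it uses the illustrative returns 0.1, -0.1, 0.2, 0.2 instead of random draws, and the names example_ret, factors and prices are made up for the illustration.

```python
import numpy as np

start_stock_price = 100
# illustrative returns from the example above (not random draws)
example_ret = np.array([0.1, -0.1, 0.2, 0.2])

# prepend the starting price to the growth factors, then take the running product
factors = np.hstack([start_stock_price, example_ret + 1])
prices = np.cumprod(factors)

print(prices)  # approximately 100, 110, 99, 118.8, 142.56
```

Note that the result has one more element than the number of returns, because the starting price is kept as the first entry.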
To answer your initial question anyway (even though I strongly advise using a solution like the cumprod one I've shown), you have 2 choices:
Either you allocate the array in advance, as you did (your stock_price_array = np.empty(numYears)). Then, instead of trying to append the new stock_price to stock_price_array, you simply fill one of the empty slots that are already there, by doing stock_price_array[i] = stock_price
Or you don't. Then you replace the np.empty line with stock_price_array = [], and at each step you append the result to create a new stock_price_array, like this: stock_price_array = np.append(stock_price_array, [stock_price])
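For reference, the first option (filling the pre-allocated array) could look roughly like the sketch below. It makes a few assumptions that are not in the question: it uses np.random.default_rng with an arbitrary seed so the run is reproducible, it draws numYears returns rather than numSimulations, and it tracks the running price in a separate variable, last_price, instead of overwriting start_stock_price.

```python
import numpy as np

annual_mean = 0.06
annual_stdev = 0.15
start_stock_price = 100
numYears = 3

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
annual_ret = rng.normal(annual_mean, annual_stdev, numYears)

stock_price_array = np.empty(numYears)  # allocated once, filled below
last_price = start_stock_price
for i in range(numYears):
    last_price = last_price * (1 + annual_ret[i])
    stock_price_array[i] = last_price  # write into the existing slot

# the loop should agree with the cumprod version (minus the starting price)
vectorized = np.cumprod(np.hstack([start_stock_price, annual_ret + 1]))[1:]
print(np.allclose(stock_price_array, vectorized))
```

The final comparison is just a sanity check that the loop and the cumprod one-liner compute the same prices.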
I strongly advise against the 2nd solution. Since you already know the final size of the array, it is much better to create it once, because np.append creates a brand new array and then copies the input data into it. It does not just extend the existing array (generally speaking, we can't do that anyway). That is also why your original call had no visible effect: np.append returns the new array, and your loop discarded it.
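A small sketch of that behaviour, with throwaway arrays a and b that are not part of the question's code:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.append(a, [3.0])  # returns a new array; a itself is not modified

print(a.shape)                 # (2,)
print(b.shape)                 # (3,)
print(np.shares_memory(a, b))  # False: b is a separate copy
```

So calling np.append without assigning its return value leaves the original array exactly as it was.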
But, well, anyway, I advise against both solutions, since I find mine (with cumprod) preferable. for is the taboo word in numpy, and even more so when what is inside the for is the creation of a new array, as with append.
Since you've mentioned Monte Carlo, and then shown code that computes only one result (you draw 1 set of annual returns and perform one computation of future values), I am wondering if that is really what you want.
In particular, I see that you have numSimulations and numYears, which appear to play redundant roles in your code (and therefore in mine).
The only reason it doesn't just throw an index error is that numSimulations is used only to decide how many annual_ret values you draw. And since numSimulations > numYears, you have more than enough annual_ret values to compute the result.
Wasn't your initial intention to redo the simulation over the years numSimulations times, to get numSimulations results?
In that case, you probably need numSimulations sets of numYears annual returns, so a 2D array. Likewise, you should be computing numSimulations series of numYears results.
If my guess is not completely off, I surmise that what you really wanted was something to the effect of:
# 2D array of annual returns: 1 simulation per row, 1 year per column
annual_ret = np.random.normal(annual_mean, annual_stdev, (numSimulations, numYears))
# Add 1 as we did earlier, and pad with an initial 100 (`start_stock_price`) at the beginning of each simulation
t = np.pad(annual_ret+1, ((0,0), (1,0)), constant_values=start_stock_price)
# Cumulative product. `axis=1` means it is done along axis 1 (along years) for each row (for each simulation)
res = np.cumprod(t, axis=1)
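Put together as a self-contained script, the 2D version might look like the sketch below. The seeded np.random.default_rng generator and the final_prices name are my own additions for the illustration, not something from the question.

```python
import numpy as np

annual_mean = 0.06
annual_stdev = 0.15
start_stock_price = 100
numYears = 3
numSimulations = 4

rng = np.random.default_rng(42)  # arbitrary seed, only for reproducibility
annual_ret = rng.normal(annual_mean, annual_stdev, (numSimulations, numYears))

# one leading column of starting prices, followed by the growth factors
t = np.pad(annual_ret + 1, ((0, 0), (1, 0)), constant_values=start_stock_price)
res = np.cumprod(t, axis=1)

print(res.shape)            # (4, 4): numSimulations rows, numYears + 1 columns
final_prices = res[:, -1]   # simulated price at the end of the last year
print(final_prices.mean())  # average final price across the simulations
```

Each row of res is one simulated price path, starting at start_stock_price, so summary statistics across simulations are taken along axis 0.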
Upvotes: 2