Reputation: 137
I am fairly new to numpy. I have the following code, which gives me the desired result:
import numpy as np
def get_result(S,K,delS):
res=np.zeros(S.shape[0],dtype=np.float64)
for j in range(res.shape[0]):
if S[j]-K>delS:
res[j]+=np.floor((S[j]-K)/delS)
K+=np.floor((S[j]-K)/delS)*delS
elif S[j]-K<-delS:
res[j]+=np.ceil((S[j]-K)/delS)
K+=np.ceil((S[j]-K)/delS)*delS
return res
S=np.array([1.0,1.05,1.1,1.12,1.09,1.14,1.21,1.6,1.05,1.0,0.95,0.90,0.87,0.77,0.63,0.85,0.91,0.76],dtype=np.float64)
K=1.0
delS=0.1
l=get_result(S,K,delS)
for j in range(S.shape[0]):
print("%d\t%.2f\t%.0f" % (j,S[j],l[j]))
The get_result function contains a for-loop, however, and is therefore awkwardly slow for larger input vectors S. Can such a function be vectorized in numpy-syntax? Any help would be appreciated.
Upvotes: 0
Views: 3373
Reputation: 231385
A general pattern when dealing with 2 or more conditions in an array or calculation, is to construct a boolean mask, and take one action for elements where that mask is true, and different where it is false:
res=np.zeros(S.shape[0],dtype=np.float64)
mask - S-K>delS
res[mask] = ...S[mask]
res[~mask] = ...S[~mask]
A variant is to identify the indices, with np.where(mask)
.
But what appears to complicate your calculation is that test keeps changing. That is, K
for j+1
derives from calculation for j
.
for j in range(res.shape[0]):
if S[j]-K>delS:
res[j]+=np.floor((S[j]-K)/delS)
K+=np.floor((S[j]-K)/delS)*delS
elif S[j]-K<-delS:
res[j]+=np.ceil((S[j]-K)/delS)
K+=np.ceil((S[j]-K)/delS)*delS
To deal with that kind of iteration we usually try to use np.cumsum
or other accumulative ufunc
methods.
As a general rule, numpy
code is fastest and easiest when the calculation can be applied to all elements of an array (or set of arrays) 'in parallel' - that is, in a way that does not depend on the order of iteration. Then we can delegate the action to fast compiled numpy functions like array addition and multiplication. If the computation is serial in nature (the value for j depending on j-1) this becomes trickier.
So if my cursory reading is right, it isn't the if-statement that is difficult, it's the serial nature of the loop.
=========================
Playing around, I was able to remove the if
(there are actually 3 subcase), but it still has to be iterative:
def get_result(S,K,delS):
S = S/delS
res=np.zeros(S.shape[0],dtype=np.float64)
for j in range(res.shape[0]):
x = S[j] - K/delS
xa = np.floor(np.abs(x)) * np.sign(x)
res[j] += xa
K += xa*delS
return res
Upvotes: 2