Reputation: 43
I'm quite new to NumPy (or SciPy) and coming from Octave/Matlab, this seems a bit challenging to me.
I'm reading through the docs and writing some basic functions. I came across this section: Vectorizing functions (vectorize)
It defines this function:
def addsubtract(a, b):
    if a > b:
        return a - b
    else:
        return a + b
Then vectorizes it:
vec_addsubtract = np.vectorize(addsubtract)
But at the end, it says:
This particular function could have been written in vector form without the use of vectorize.
I don't know any other way to write such a function. So what is the vector form?
Upvotes: 3
Views: 535
Reputation: 114310
np.vectorize is a glorified Python for loop, which means that it effectively strips away any optimizations that NumPy offers.
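To make the "glorified for loop" point concrete, here is a minimal sketch that compares np.vectorize with an explicit Python loop over the broadcast element pairs. The helper name loop_addsubtract and the sample arrays are made up for illustration, and this is only a rough model of what np.vectorize does, not its actual implementation.

```python
import numpy as np

def addsubtract(a, b):
    # Scalar version from the question
    if a > b:
        return a - b
    else:
        return a + b

def loop_addsubtract(a, b):
    # Hypothetical helper: call the scalar function once per element pair
    a, b = np.broadcast_arrays(a, b)
    out = np.empty(a.shape, dtype=np.result_type(a, b))
    for idx in np.ndindex(a.shape):
        out[idx] = addsubtract(a[idx], b[idx])
    return out

# Illustrative sample data
a = np.array([1, 5, 3, 8])
b = np.array([4, 2, 3, 1])

vec_result = np.vectorize(addsubtract)(a, b)
loop_result = loop_addsubtract(a, b)
print(vec_result, loop_result)
```

Both calls invoke the Python-level function once per element, which is where the time goes.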
To actually vectorize addsubtract, we can use the fact that NumPy offers three things: a vectorized add function, a vectorized subtract function, and all sorts of boolean mask operations.
The simplest, but least efficient, way to write this is using np.where:
np.where(a > b, a - b, a + b)
This is inefficient because it pre-computes a - b and a + b in all cases, and then selects from one or the other for each element.
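Spelling the np.where call out step by step shows where the extra work happens. In this small sketch the intermediate names diff and total, and the sample arrays, are made up for illustration; both intermediates are full-size arrays that exist before any selection takes place.

```python
import numpy as np

# Illustrative sample data
a = np.array([1, 5, 3, 8])
b = np.array([4, 2, 3, 1])

diff = a - b    # computed for every element, even where a <= b
total = a + b   # computed for every element, even where a > b
mask = a > b

# Pick from diff where mask is True, from total elsewhere
result = np.where(mask, diff, total)
print(result)
```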
A more efficient solution would only compute the values where the condition required it:
result = np.empty_like(a)
mask = a > b
np.subtract(a, b, where=mask, out=result)
np.add(a, b, where=~mask, out=result)
For very small arrays, the overhead of the more complicated method makes it less worthwhile. But for large arrays, it avoids allocating two full-size temporary arrays, which can make it the fastest solution.
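The masked approach can be wrapped in a function along these lines. This is a sketch under a few assumptions: the name addsubtract_masked is made up, and the output is sized from the mask and typed with np.result_type so that broadcasting and mixed int/float inputs work, instead of relying on np.empty_like(a).

```python
import numpy as np

def addsubtract_masked(a, b):
    # Hypothetical wrapper around the masked-ufunc approach
    a, b = np.asarray(a), np.asarray(b)
    mask = a > b  # has the broadcast shape of a and b
    # Assumption: size the output from the mask and use the common dtype
    result = np.empty(mask.shape, dtype=np.result_type(a, b))
    # Each ufunc only writes the elements selected by its mask
    np.subtract(a, b, where=mask, out=result)
    np.add(a, b, where=~mask, out=result)
    return result

print(addsubtract_masked([1, 5, 3, 8], [4, 2, 3, 1]))
print(addsubtract_masked(np.array([1.0, 5.0]), 3))
```

Because mask and ~mask together cover every element, no entry of the uninitialized result array is left unwritten.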
Fun fact: the page in the tutorial you are referencing will not be available in future versions of the SciPy tutorial exactly because it is an intro to NumPy, as explained in PR #12432.
Upvotes: 1
Reputation: 1489
You can do this with np.where, which computes both results (a-b and a+b) and selects the values depending on a boolean array (a>b):
def addsubtract(a, b):
    return np.where(a>b, a-b, a+b)
It can be seen as a vectorized ternary operator: "Where a>b, take the value from a-b, else take the value from a+b".
Despite computing both possible results, it was significantly faster than the vectorized if/else function you wrote (at least on my machine).
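If you want to check this on your own machine, a rough timing sketch with timeit could look like the following. The array size, the repeat count, and the names addsubtract_scalar and addsubtract_where are arbitrary choices for illustration; the numbers will vary between machines and NumPy versions.

```python
import timeit
import numpy as np

def addsubtract_scalar(a, b):
    # Scalar if/else version, to be wrapped by np.vectorize
    return a - b if a > b else a + b

vec_addsubtract = np.vectorize(addsubtract_scalar)

def addsubtract_where(a, b):
    return np.where(a > b, a - b, a + b)

# Illustrative test data: 10,000 random integers per array
rng = np.random.default_rng(0)
a = rng.integers(0, 100, size=10_000)
b = rng.integers(0, 100, size=10_000)

t_vectorize = timeit.timeit(lambda: vec_addsubtract(a, b), number=20)
t_where = timeit.timeit(lambda: addsubtract_where(a, b), number=20)
print(f"np.vectorize: {t_vectorize:.4f} s, np.where: {t_where:.4f} s")
```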
Upvotes: 1