simedro

Reputation: 43

NumPy - writing a function in vector form?

I'm quite new to NumPy (and SciPy), and coming from Octave/MATLAB, this seems a bit challenging to me.

I'm reading through the docs and writing some basic functions. I came across this section: Vectorizing functions (vectorize)

It defines this function:

def addsubtract(a, b):
    if a > b:
        return a - b
    else:
        return a + b

Then vectorizes it:

vec_addsubtract = np.vectorize(addsubtract)

But at the end, it says:

This particular function could have been written in vector form without the use of vectorize.

I wouldn't know any other way to write such a function. So what is the vector form?

Upvotes: 3

Views: 535

Answers (2)

Mad Physicist

Reputation: 114310

np.vectorize is essentially a glorified Python for loop: it calls your function once per element, which means it effectively gives up the optimizations that NumPy offers.
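To make the "for loop" point concrete, here is a minimal sketch that counts how often the Python function gets called. The `calls` counter and the sample arrays are made up for illustration, and `otypes` is passed only so that NumPy does not need an extra trial call to infer the output dtype.

```python
import numpy as np

calls = 0  # hypothetical counter, only here to make the per-element calls visible


def addsubtract(a, b):
    global calls
    calls += 1
    if a > b:
        return a - b
    else:
        return a + b


# Passing otypes avoids the trial call NumPy otherwise makes on the
# first element to work out the output dtype.
vec_addsubtract = np.vectorize(addsubtract, otypes=[np.int64])

a = np.array([5, 1, 7, 2])
b = np.array([3, 4, 7, 9])

out = vec_addsubtract(a, b)
print(out.tolist())  # [2, 5, 14, 11]
print(calls)         # the Python function ran for each of the elements
```

Every one of those calls goes through the interpreter, which is why this does not scale the way a real ufunc does.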

To actually vectorize addsubtract, we can use the fact that NumPy offers three things: a vectorized add function, a vectorized subtract function, and all sorts of boolean mask operations.

The simplest, but least efficient, way to write this is using np.where:

np.where(a > b, a - b, a + b)

This is inefficient because it pre-computes a - b and a + b in all cases, and then selects from one or the other for each element.
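As a small illustration of what np.where is doing here, the sketch below spells out the two intermediate arrays explicitly. The names `diff` and `total` and the sample values are only for illustration.

```python
import numpy as np

a = np.array([5, 1, 7, 2])
b = np.array([3, 4, 7, 9])

diff = a - b    # computed for every element, even where a <= b
total = a + b   # computed for every element, even where a > b

# Pick from diff where the condition holds, otherwise from total.
result = np.where(a > b, diff, total)
print(result.tolist())  # [2, 5, 14, 11]
```

Note that equal elements (7 and 7 here) fall into the `a + b` branch, exactly as in the original if/else.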

A more efficient solution would only compute the values where the condition required it:

result = np.empty_like(a)
mask = a > b
np.subtract(a, b, where=mask, out=result)
np.add(a, b, where=~mask, out=result)

For very small arrays, the overhead of the more complicated method makes it less worthwhile. For large arrays it avoids the wasted work, so it should be the faster option, but time both on your own data to confirm.
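If you want the masked version as a reusable function, something like the following sketch should work. The name `addsubtract_masked` is made up, and sizing the output from the mask and `np.result_type` is my own assumption so that broadcasting and mixed dtypes are handled; the snippet above simply uses `np.empty_like(a)`.

```python
import numpy as np


def addsubtract_masked(a, b):
    """a - b where a > b, otherwise a + b, computing each branch only where needed."""
    a = np.asarray(a)
    b = np.asarray(b)
    mask = a > b
    # Allocate the output with the broadcast shape and the promoted dtype.
    result = np.empty(mask.shape, dtype=np.result_type(a, b))
    # Each ufunc writes only into the positions selected by `where`.
    np.subtract(a, b, where=mask, out=result)
    np.add(a, b, where=~mask, out=result)
    return result


print(addsubtract_masked([5, 1, 7, 2], [3, 4, 7, 9]).tolist())  # [2, 5, 14, 11]
```

Because the two masks are complementary, every slot of the uninitialized output gets written exactly once.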

Fun fact: the page in the tutorial you are referencing will not be available in future versions of the SciPy tutorial exactly because it is an intro to NumPy, as explained in PR #12432.

Upvotes: 1

Niklas Mertsch

Reputation: 1489

You can do this with np.where, which computes both results (a - b and a + b) and selects the values depending on a boolean array (a > b):

def addsubtract(a, b):
    return np.where(a > b, a - b, a + b)

It can be seen as a vectorized ternary operator: "Where a>b, take the value from a-b, else take the value from a+b".

Despite computing both possible results, it was significantly faster than the vectorized if/else function you wrote (at least on my machine).
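If you want to check this on your own machine, a rough timing sketch could look like the one below. The array size, the seed, and the repeat count are arbitrary choices of mine, and the numbers will vary between machines, so no expected timings are shown.

```python
import timeit

import numpy as np


def addsubtract_scalar(a, b):
    # The original scalar if/else logic.
    return a - b if a > b else a + b


vec_addsubtract = np.vectorize(addsubtract_scalar)


def addsubtract_where(a, b):
    return np.where(a > b, a - b, a + b)


rng = np.random.default_rng(0)
a = rng.random(100_000)
b = rng.random(100_000)

# Time a few repetitions of each approach on the same inputs.
t_vec = timeit.timeit(lambda: vec_addsubtract(a, b), number=3)
t_where = timeit.timeit(lambda: addsubtract_where(a, b), number=3)
print(f"np.vectorize: {t_vec:.4f} s, np.where: {t_where:.4f} s")
```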

Upvotes: 1
