GeoMonkey

Reputation: 1671

Comparing values in two numpy arrays with 'if'

I'm fairly new to numpy arrays and have run into a problem when comparing one array with another.

I have two arrays, such that:

a = np.array([1,2,3,4,5])
b = np.array([2,4,3,5,2])

I want to do something like the following:

if b > a:
    c = b
else:
    c = a

so that I end up with an array c = np.array([2,4,3,5,5]).

This can be otherwise thought of as taking the max value for each element of the two arrays.

However, I am running into the error

ValueError: The truth value of an array with more than one element is ambiguous. 
Use a.any() or a.all(). 

I have tried using these, but I'm not sure that they are right for what I want.

Is someone able to offer some advice in solving this?

Upvotes: 6

Views: 12897

Answers (5)

Gursel Karacor

Reputation: 1167

This may not be the most efficient approach, but it answers the original question more directly:

import numpy as np

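a = np.array([1,2,3,4,5])
b = np.array([2,4,3,5,2])

# Same shape and dtype as a, so c is a 1-D integer array rather than a (5,1) float array
c = np.zeros_like(a)

# Apply the if/else to one pair of elements at a time
for i in range(len(a)):
    if b[i] > a[i]:
        c[i] = b[i]
    else:
        c[i] = a[i]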

Upvotes: 0

Nras

Reputation: 4311

You are looking for the function np.fmax. It takes the element-wise maximum of the two arrays, ignoring NaNs.

import numpy as np
a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 4, 3, 5, 2])
c = np.fmax(a, b)

The output is

array([2, 4, 3, 5, 5])
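
The "ignoring NaNs" part is what separates np.fmax from np.maximum. A minimal sketch of the difference, using made-up float arrays x and y (they are not from the question) that contain NaN:

```python
import numpy as np

# Hypothetical example arrays containing NaN; not part of the original question
x = np.array([1.0, np.nan, 3.0])
y = np.array([2.0, 5.0, np.nan])

ignoring = np.fmax(x, y)        # takes the non-NaN value where only one side is NaN
propagating = np.maximum(x, y)  # gives NaN wherever either side is NaN

print(ignoring)
print(propagating)
```

For integer arrays like the ones in the question there are no NaNs, so both functions should give the same result.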

Upvotes: 15

Ashoka Lella

Reputation: 6729

Here's another way of achieving this:

c = np.array([y if y>z else z for y,z in zip(a,b)])

Upvotes: 2

skyuuka

Reputation: 685

The following methods also work:

  1. Use numpy.maximum

    >>> np.maximum(a, b)

  2. Use numpy.max and numpy.vstack

    >>> np.max(np.vstack((a, b)), axis=0)

Upvotes: 1

abarnert

Reputation: 366143

As with almost everything else in numpy, comparisons are done element-wise, returning a whole array:

>>> b > a
array([ True,  True, False,  True, False], dtype=bool)

So, is that true or false? What should an if statement do with it?

Numpy's answer is that it shouldn't try to guess; it should just raise an exception.
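
A minimal sketch of that behaviour, using the arrays from the question. The assumption here is that an if statement effectively asks for bool() of the comparison result, which is the step that raises:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 4, 3, 5, 2])

try:
    # `if b > a:` needs a single truth value, which is what bool() asks for here
    bool(b > a)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```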

If you want to consider it true because at least one value is true, use any:

>>> if np.any(b > a): print('Yes!')
Yes!

If you want to consider it false because not all values are true, use all:

>>> if np.all(b > a): print('Yes!')

But I'm pretty sure you don't want either of these. You want to broadcast the whole if/else over the array.

You could of course wrap the if/else logic for a single value in a function, then explicitly vectorize it and call it:

>>> def mymax(a, b):
...     if b > a:
...         return b
...     else:
...         return a
...
>>> vmymax = np.vectorize(mymax)
>>> vmymax(a, b)
array([2, 4, 3, 5, 5])

This is worth knowing how to do… but very rarely worth doing: np.vectorize is essentially a Python-level loop, so it is convenient but not fast. There's usually a more indirect way to do it using natively-vectorized functions, and often a more direct way, too.


One way to do it indirectly is by using the fact that True and False are numerical 1 and 0:

>>> (b>a)*b + (b<=a)*a
array([2, 4, 3, 5, 5])

This computes 1*b[i] + 0*a[i] when b>a, and 0*b[i] + 1*a[i] when b<=a. A bit ugly, but not too hard to understand. There are clearer, but more verbose, ways to write this.
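
Two such ways are sketched below, as additions to the answer rather than part of it: np.where states the if/else directly, and boolean-mask assignment spells it out step by step. The names c_where, c_mask and mask are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 4, 3, 5, 2])

# np.where(condition, x, y): take x where condition is True, else y
c_where = np.where(b > a, b, a)

# Boolean-mask assignment: start from a copy of a,
# then overwrite the positions where b is larger
mask = b > a
c_mask = a.copy()
c_mask[mask] = b[mask]

print(c_where)  # [2 4 3 5 5]
print(c_mask)   # [2 4 3 5 5]
```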

But let's look for an even better, direct solution.


First, notice that your mymax function does exactly the same thing as Python's built-in max when given 2 values, so you can vectorize max directly:

>>> vmymax = np.vectorize(max)
>>> vmymax(a, b)
array([2, 4, 3, 5, 5])

Then consider that for something so useful, numpy probably already has it. And a quick search will turn up maximum:

>>> np.maximum(a, b)
array([2, 4, 3, 5, 5])
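
To get a rough feel for why the native function is preferred over np.vectorize, here is a small timing sketch. The larger random arrays and the repeat count are assumptions chosen only to make the difference visible; actual numbers will vary by machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical larger inputs, used only so the timing difference is measurable
a = rng.integers(0, 100, size=100_000)
b = rng.integers(0, 100, size=100_000)

vmax = np.vectorize(max)  # calls Python's max once per pair of elements

# Both approaches should produce the same values
same = np.array_equal(vmax(a, b), np.maximum(a, b))

t_vectorize = timeit.timeit(lambda: vmax(a, b), number=3)
t_native = timeit.timeit(lambda: np.maximum(a, b), number=3)

print(f"same result:       {same}")
print(f"np.vectorize(max): {t_vectorize:.4f} s")
print(f"np.maximum:        {t_native:.4f} s")
```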

Upvotes: 6
