J.Doe

Reputation: 69

Adding arrays which may contain 'None'-entries

I have a question regarding the addition of numpy arrays. Let's assume I have defined a function

def foo(a,b):
    return a+b

that takes two arrays of the same shape and simply returns their sum. Now I have to handle the case where some of the entries may be None. I would like to treat those entries as float(0), so that

[1.0,None,2.0] + [1.0,2.0,2.0] 

would add up to

[2.0,2.0,4.0]

Can you provide me with an already-implemented solution?

TIA

Upvotes: 1

Views: 837

Answers (3)

sushain97

Reputation: 2802

I suggest numpy.nan_to_num:

>>> np.nan_to_num(np.array([1.0,None,2.0], dtype=float))
array([ 1.,  0.,  2.])

Then,

>>> def foo(a,b):
...         return np.nan_to_num(a) + np.nan_to_num(b)
...
>>> foo(np.array([1.0,None,2.0], dtype=float), np.array([1.0,2.0,2.0], dtype=float))
array([ 2.,  2.,  4.])
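
If the inputs arrive as plain Python lists rather than float arrays, the conversion can be folded into the function. The following is a minimal sketch, not a built-in; the name add_treating_none_as_zero is made up for illustration, and it assumes both inputs have the same shape and contain only numbers or None.

```python
import numpy as np

def add_treating_none_as_zero(a, b):
    """Add two equal-shaped sequences, counting None (or NaN) entries as 0.0."""
    # dtype=float makes NumPy store None as NaN instead of building an object array.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # nan_to_num replaces NaN with 0.0 by default.
    return np.nan_to_num(a) + np.nan_to_num(b)

result = add_treating_none_as_zero([1.0, None, 2.0], [1.0, 2.0, 2.0])
print(result)  # [2. 2. 4.]
```

Note that an entry that is None in both inputs comes out as 0.0, which matches the "treat None as zero" requirement in the question.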

Upvotes: 3

abarnert

Reputation: 365945

Usually, the answer to this is to use an array of floats, rather than an array of arbitrary objects, and then use np.nan instead of None. NaN has well-defined semantics for arithmetic. (Also, using an array of floats instead of objects will make your code significantly more time and space efficient.)


Notice that you don't have to manually convert None to np.nan if you build the array with an explicit dtype of float or np.float64. Both of these are equivalent:

>>> a = np.array([1.0,np.nan,2.0])
>>> a = np.array([1.0,None,2.0],dtype=float)

This means that if, for some reason, you really need arrays of arbitrary objects with actual None values in them, you can keep them and convert to arrays of floats on the fly to get the benefits of NaN:

>>> a.astype(float) + b.astype(float)
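
As a self-contained illustration of that on-the-fly conversion, here is a small sketch. The variable names a_float, b_float and total are only illustrative, and the arrays are built with dtype=object on purpose to mimic data that still holds real None values.

```python
import numpy as np

# Object arrays keep the original Python objects, including None.
a = np.array([1.0, None, 2.0], dtype=object)
b = np.array([1.0, 2.0, 2.0], dtype=object)

# Adding the object arrays directly would raise TypeError, because
# None + 2.0 is undefined. Casting to float first turns each None into NaN.
a_float = a.astype(float)
b_float = b.astype(float)

total = a_float + b_float
print(total)  # the middle entry is NaN, since NaN + 2.0 is NaN
```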

At any rate, in this case, just using NaN isn't sufficient:

>>> a = np.array([1.0,np.nan,2.0])
>>> b = np.array([1.0,2.0,2.0])
>>> a + b
array([ 2., nan,  4.])

That's because, by definition, any arithmetic operation involving NaN yields NaN. But you want to treat it as 0.

Still, using NaN makes the problem easy to solve. The simplest way is the function nan_to_num:

>>> np.nan_to_num(a)
array([1., 0., 2.])
>>> np.nan_to_num(a) + np.nan_to_num(b)
array([2., 2., 4.])
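
One caveat worth knowing: besides NaN, nan_to_num by default also replaces positive and negative infinity with very large finite numbers. If infinities should pass through untouched, a possible alternative (a different technique from the one above, shown only as a sketch) is np.where combined with np.isnan. The helper name nan_as_zero is hypothetical.

```python
import numpy as np

a = np.array([1.0, np.nan, np.inf])
b = np.array([1.0, 2.0, 2.0])

# nan_to_num also rewrites +/-inf to the largest finite floats,
# so the last entry of this sum is finite.
via_nan_to_num = np.nan_to_num(a) + b

def nan_as_zero(x):
    # Replace only the NaN entries with 0.0; infinities are left alone.
    return np.where(np.isnan(x), 0.0, x)

# Here the last entry of the sum stays inf.
via_where = nan_as_zero(a) + nan_as_zero(b)
```

Which behavior is preferable depends on whether infinities can occur in the data at all; for arrays that only ever hold finite numbers and None, both give the same result.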

Upvotes: 2

Kasravnd

Reputation: 107347

You can use column_stack to stack both arrays as the columns of a 2-D array, then use np.nansum() to sum along the second axis, which treats NaN entries as zero.

In [15]: a = np.array([1.0,None,2.0], dtype=float)
# Passing dtype=float here is necessary to convert None to np.nan

In [16]: b = np.array([1.0,2.0,2.0]) 

In [17]: np.nansum(np.column_stack((a, b)), 1)
Out[17]: array([2., 2., 4.])
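
The same idea extends to more than two arrays. Below is a rough sketch that uses np.stack instead of column_stack so it also works for inputs of any (equal) shape. The function name nansum_arrays is made up for this example, and it assumes every input contains only numbers or None.

```python
import numpy as np

def nansum_arrays(*arrays):
    """Element-wise sum of equal-shaped inputs, skipping None/NaN entries."""
    # Convert each input to float (None becomes NaN) and stack along a new axis 0.
    stacked = np.stack([np.asarray(x, dtype=float) for x in arrays])
    # nansum treats NaN as zero while summing across the stacked inputs.
    return np.nansum(stacked, axis=0)

total = nansum_arrays([1.0, None, 2.0], [1.0, 2.0, 2.0], [None, None, 1.0])
print(total)  # [2. 2. 5.]
```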

Upvotes: 2
