Piyushkumar Patel

Reputation: 199

Avoid NaN values and add two matrices element-wise in Python

I have two 4D matrices that I would like to add. The matrices have exactly the same dimensions and number of elements, but both contain randomly distributed NaN values.

I would prefer to add them using numpy.nansum, following these rules:
(1) if two values are added, I want the sum to be a value,
(2) if a value and a NaN are added, I want the sum to be the value, and
(3) if two NaNs are added, I want the sum to be NaN.

Here is what I tried:

a[6x7x180x360]
b[6x7x180x360]

C=np.nansum[(a,b)]
C=np.nansum(np.dstack((a,b)),2)

But I am unable to get a resultant matrix with the same dimensions as the input; the resultant matrix C should be [6x7x180x360]. Can anyone help in this regard? Thank you in advance.

Upvotes: 6

Views: 2587

Answers (3)

unutbu

Reputation: 879671

You could use np.stack((a,b)) to stack along a new 0-axis, then call nansum to sum along that 0-axis:

C = np.nansum(np.stack((a,b)), axis=0)

For example,

In [34]: a = np.random.choice([1,2,3,np.nan], size=(6,7,180,360))

In [35]: b = np.random.choice([1,2,3,np.nan], size=(6,7,180,360))

In [36]: np.stack((a,b)).shape
Out[36]: (2, 6, 7, 180, 360)

In [37]: np.nansum(np.stack((a,b)), axis=0).shape
Out[37]: (6, 7, 180, 360)

You had the right idea, but np.dstack concatenates along the third axis (axis 2), which is not desirable here since your arrays already have 4 axes:

In [31]: np.dstack((a,b)).shape
Out[31]: (6, 7, 360, 360)

Regarding your point (3): Note that the behavior of np.nansum depends on the NumPy version:

In NumPy versions <= 1.8.0 NaN is returned for slices that are all-NaN or empty. In later versions zero is returned.

If you are using NumPy version > 1.8.0, then you may have to use a solution such as Maarten Fabré's to address this issue.
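As a quick check of point (3), here is a minimal sketch on small made-up arrays (not the asker's data), assuming a NumPy version newer than the change mentioned above. It shows that the stacked nansum keeps the input shape but yields 0 rather than NaN where both inputs are NaN:

```python
import numpy as np

# Small illustrative inputs covering all three cases from the question.
a = np.array([1.0, 2.0, np.nan, np.nan])
b = np.array([10.0, np.nan, 30.0, np.nan])

# Stack along a new leading axis, then sum over it while treating NaN as 0.
c = np.nansum(np.stack((a, b)), axis=0)

print(c.shape)  # same shape as a and b
print(c)        # the last element is 0.0 rather than NaN on newer NumPy versions
```

So rules (1) and (2) are satisfied by nansum alone, while rule (3) needs an extra step that puts NaN back where both inputs were NaN.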

Upvotes: 9

Maarten Fabré

Reputation: 7058

I think the easiest way is to use np.where:

result = np.where(
    np.isnan(a+b),
    np.where(np.isnan(a), b, a), 
    a+b
)

This reads as: if a+b is not NaN, use a+b; otherwise use a, unless a is NaN, in which case use b. Whether or not b is NaN is of little consequence at that point, because if both are NaN the result should be NaN anyway.

Alternatively, you can use it like this:

result2 = np.where(
    np.isnan(a) & np.isnan(b),
    np.nan,
    np.nansum(np.stack((a,b)), axis=0)
)

np.testing.assert_equal(result, result2) passes, so both versions give the same result.
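To make the three rules concrete, here is a small worked sketch of the nested np.where approach on 2x2 arrays chosen purely for illustration (one cell per case, plus one extra):

```python
import numpy as np

# value + value, NaN + value, NaN + NaN, value + NaN
a = np.array([[1.0, np.nan],
              [np.nan, 4.0]])
b = np.array([[10.0, 20.0],
              [np.nan, np.nan]])

total = a + b  # NaN wherever either input is NaN

# Where the plain sum is NaN, fall back to whichever input is not NaN
# (if both are NaN, the fallback is b, which is NaN, as required).
result = np.where(np.isnan(total), np.where(np.isnan(a), b, a), total)

print(result)
```

Only the cell where both inputs are NaN stays NaN; the others hold 11, 20 and 4.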

Upvotes: 1

William Laroche

Reputation: 151

I believe the function np.nansum is not appropriate in your case. If I understand your question correctly, you wish to do an element-wise addition of two matrices with a little logic regarding the NaN values.

Here is a full example of how to do it:

import numpy as np

a = np.array([  [np.nan, 2],
                [3, np.nan]])

b = np.array([  [3, np.nan],
                [1, np.nan]])

result = np.add(a,b)

a_is_nan = np.isnan(a)
b_is_nan = np.isnan(b)

result_is_nan = np.isnan(result)

mask_a = np.logical_and(result_is_nan, np.logical_not(a_is_nan))
result[mask_a] = a[mask_a]

mask_b = np.logical_and(result_is_nan, np.logical_not(b_is_nan))
result[mask_b] = b[mask_b]

print(result)

A little bit of explanation:

The first operation is np.add(a,b). This adds both matrices, and any element that is NaN in either input produces NaN in the result.

To recover the values that were lost to NaN, we use logical masks:

# result_is_nan is a boolean array containing True wherever the result is NaN. This occurs when either of the two elements was NaN
result_is_nan = np.isnan(result)

# mask_a is a boolean array which 'flags' elements that are NaN in result but were not NaN in a
mask_a = np.logical_and(result_is_nan, np.logical_not(a_is_nan))
# Using that mask, we assign those values from a to result
result[mask_a] = a[mask_a]

The same is then done with mask_b for the values of b. There you have it!
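The masking steps above can also be wrapped in a function and tried on 4D input. This is only a sketch: the name nan_aware_add is hypothetical, the inputs are assumed to be float arrays of equal shape, and the random test arrays are made up for illustration.

```python
import numpy as np

def nan_aware_add(a, b):
    """Hypothetical helper: element-wise sum that is NaN only where both inputs are NaN."""
    result = np.add(a, b)                  # new array; NaN wherever either input is NaN
    result_is_nan = np.isnan(result)

    mask_a = result_is_nan & ~np.isnan(a)  # result is NaN but a has a value
    result[mask_a] = a[mask_a]

    mask_b = result_is_nan & ~np.isnan(b)  # result is NaN but b has a value
    result[mask_b] = b[mask_b]
    return result

# Small 4D arrays standing in for the (6, 7, 180, 360) data in the question.
rng = np.random.default_rng(0)
a = rng.choice([1.0, 2.0, 3.0, np.nan], size=(2, 3, 4, 5))
b = rng.choice([1.0, 2.0, 3.0, np.nan], size=(2, 3, 4, 5))

c = nan_aware_add(a, b)
print(c.shape)  # same shape as the inputs
```

Because np.add returns a new array, a and b are left unmodified, and the boolean masks work for any number of dimensions.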

Upvotes: 2
