JackLidge

Reputation: 441

Replace zero values in an xarray

I have an xarray dataset with three separate 4x4 matrices, currently filled with random values.

I can mask each 4x4 matrix so that all values equal to zero become NaN, and I would like to replace those NaN values with the value from the next matrix down.

This will eventually be expanded to very large arrays of satellite imagery so I can perform searches and create imagery based off the "last best pixel". Below is the code I'm currently using for reference:

import numpy as np
import xarray as xr

dval = np.random.randint(5,size=[3,4,4])

x = [0,1,2,3]
y = [0,1,2,3]
time = ['2017-10-13','2017-10-12','2017-10-11']

a = xr.DataArray(dval,coords=[time,x,y],dims=['time','x','y'])

a = a.where(a > 0)
b = a.sel(time = time[0]).to_masked_array()

What I'd like to do is have any masked (NaN) values in b replaced with the values from the 4x4 matrix corresponding to '2017-10-12'. Any help with this would be greatly appreciated.

Upvotes: 2

Views: 11173

Answers (1)

shoyer

Reputation: 9603

You can do forward and backward filling with the ffill() and bfill() methods, e.g.,

import numpy as np
import xarray as xr

dval = np.random.RandomState(0).randint(5,size=[3,4,4])

x = [0,1,2,3]
y = [0,1,2,3]
time = ['2017-10-13','2017-10-12','2017-10-11']

a = xr.DataArray(dval,coords=[time,x,y],dims=['time','x','y'])
# Mask zeros as NaN
a = a.where(a > 0)
# Fill each NaN with the next valid value along time (the next matrix down)
filled = a.bfill('time')

Results in:

>>> a
<xarray.DataArray (time: 3, x: 4, y: 4)>
array([[[ 4., nan,  3.,  3.],
        [ 3.,  1.,  3.,  2.],
        [ 4., nan, nan,  4.],
        [ 2.,  1., nan,  1.]],

       [[ 1., nan,  1.,  4.],
        [ 3., nan,  3., nan],
        [ 2.,  3., nan,  1.],
        [ 3.,  3.,  3., nan]],

       [[ 1.,  1.,  1., nan],
        [ 2.,  4.,  3.,  3.],
        [ 2.,  4.,  2., nan],
        [nan,  4., nan,  4.]]])
Coordinates:
  * time     (time) <U10 '2017-10-13' '2017-10-12' '2017-10-11'
  * x        (x) int64 0 1 2 3
  * y        (y) int64 0 1 2 3

>>> filled
<xarray.DataArray (time: 3, x: 4, y: 4)>
array([[[ 4.,  1.,  3.,  3.],
        [ 3.,  1.,  3.,  2.],
        [ 4.,  3.,  2.,  4.],
        [ 2.,  1.,  3.,  1.]],

       [[ 1.,  1.,  1.,  4.],
        [ 3.,  4.,  3.,  3.],
        [ 2.,  3.,  2.,  1.],
        [ 3.,  3.,  3.,  4.]],

       [[ 1.,  1.,  1., nan],
        [ 2.,  4.,  3.,  3.],
        [ 2.,  4.,  2., nan],
        [nan,  4., nan,  4.]]])
Coordinates:
  * time     (time) <U10 '2017-10-13' '2017-10-12' '2017-10-11'
  * x        (x) int64 0 1 2 3
  * y        (y) int64 0 1 2 3

The related interpolate_na() method can also be handy for these situations (but not in this particular case).
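
If only the single "last best pixel" composite is needed rather than the whole filled stack, chaining fillna() over the time slices may be enough. The following is a rough sketch, not a tested recipe for large imagery: last_best_pixel is a hypothetical helper name, and the data is the seeded example above written out by hand with zeros marking missing pixels.

```python
import numpy as np
import xarray as xr

# Same values as the seeded example above; zeros mark missing pixels.
dval = np.array([
    [[4, 0, 3, 3], [3, 1, 3, 2], [4, 0, 0, 4], [2, 1, 0, 1]],
    [[1, 0, 1, 4], [3, 0, 3, 0], [2, 3, 0, 1], [3, 3, 3, 0]],
    [[1, 1, 1, 0], [2, 4, 3, 3], [2, 4, 2, 0], [0, 4, 0, 4]],
])

x = [0, 1, 2, 3]
y = [0, 1, 2, 3]
time = ['2017-10-13', '2017-10-12', '2017-10-11']

a = xr.DataArray(dval, coords=[time, x, y], dims=['time', 'x', 'y'])
a = a.where(a > 0)


def last_best_pixel(da, dim='time'):
    """Hypothetical helper: take the first slice along `dim` and fill its
    NaNs from each following slice in turn."""
    # drop=True removes the scalar coordinate so the slices combine cleanly
    composite = da.isel({dim: 0}, drop=True)
    for i in range(1, da.sizes[dim]):
        composite = composite.fillna(da.isel({dim: i}, drop=True))
    return composite


composite = last_best_pixel(a)
print(composite.values)
```

For this data the result should match the first time slice of filled above. The loop assumes the slices are already ordered newest first, as in the question.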

Upvotes: 4
