Monobakht
Monobakht

Reputation: 303

Interpolate dataArray missing data using xarray

I am using xarray/rasterio to do some operations on a set of GeoTiff files. The data that I am dealing with contain missing values (-9999) at some grid points. Is there any efficient pythonic way to replace these values with interpolated data using xarray?

For example I have something like:

>>> da = xr.DataArray([[1.0,2.0,3.0],[4.0,-999.0,6.0],[7.0,-999.0,9.0]],[('x',[1,2,3]),('y',[1,2,3])])
>>> da
<xarray.DataArray (x: 3, y: 3)>
array([[   1.,    2.,    3.],
       [   4., -9999.,    6.],
       [   7., -9999.,    9.]])
Coordinates:
  * x        (x) int64 1 2 3
  * y        (y) int64 1 2 3

and I want to replace -9999 values with the interpolated values based on adjacent grid points. Any hint would be appreciated!

Upvotes: 3

Views: 3963

Answers (3)

Plagioclase
Plagioclase

Reputation: 17

I still do not think there is an efficient way to do this in xarray. The suggestion by @bwc will not work properly for multidimensional arrays, because the first pass along the first dimension will already fill all gaps. This may be acceptable if your missing values are just single datapoints, but it will result in unwanted behaviour if you have larger gaps.

The simplest solution I'm aware of for your problem is using the setmisstonn or setmisstodis operator in CDO (depending on how you want to interpolate).

Upvotes: 0

MuellerSeb
MuellerSeb

Reputation: 859

For 2D interpolation you could use rioxarray.

Upvotes: 1

bwc
bwc

Reputation: 1115

You can use da.interpolate_na() to do one dimensional interpolation of NA values - I am unsure if there is a two dimensional method.

You can convert your -999 values to NaN's and then interpolate.

da = xr.DataArray([[1.0,2.0,3.0],[4.0,-999.0,6.0],[7.0,-999.0,9.0]],[('x',[1,2,3]),('y',[1,2,3])])
da

<xarray.DataArray (x: 3, y: 3)>
array([[   1.,    2.,    3.],
       [   4., -999.,    6.],
       [   7., -999.,    9.]])
Coordinates:
  * x        (x) int64 1 2 3
  * y        (y) int64 1 2 3

new_da = da.where(da != -999.0, np.nan)
new_da

<xarray.DataArray (x: 3, y: 3)>
array([[ 1.,  2.,  3.],
       [ 4., nan,  6.],
       [ 7., nan,  9.]])
Coordinates:
  * x        (x) int64 1 2 3
  * y        (y) int64 1 2 3

new_da.interpolate_na(dim=('y'), method='linear')


<xarray.DataArray (x: 3, y: 3)>
array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])
Coordinates:
  * x        (x) int64 1 2 3
  * y        (y) int64 1 2 3

Since it's a 1-D interpolation, you can just interpolate for each dimension separately.

Upvotes: 1

Related Questions