Ress

Reputation: 780

Best way to use xarray where along a specific axis?

What is the best way to mask a 2-D xarray DataArray using "where" along a specific dimension, given a 1-D array? Here is an example:

    import numpy as np
    import pandas as pd
    import xarray as xr

    da = xr.DataArray(
        np.random.rand(4, 3),
        [
            ("time", pd.date_range("2000-01-01", periods=4)),
            ("space", ["IA", "IL", "IN"]),
        ],
    )
>>> da
<xarray.DataArray (time: 4, space: 3)>
array([[0.26519114, 0.60342615, 0.49726218],
       [0.02599198, 0.91702113, 0.7771629 ],
       [0.1575904 , 0.25217269, 0.74094842],
       [0.7581441 , 0.83447034, 0.31751737]])

and I have a 1-D array/list:

I = [1, 0, 1, 1]

My goal is to get all the rows where I==1. What I do right now is something like this:

I2 = np.repeat(I, repeats=da.shape[1], axis=0).reshape(da.shape)

>>> da.where(I2==1)
<xarray.DataArray (time: 4, space: 3)>
array([[0.26519114, 0.60342615, 0.49726218],
       [       nan,        nan,        nan],
       [0.1575904 , 0.25217269, 0.74094842],
       [0.7581441 , 0.83447034, 0.31751737]])

Is there another way to do this?

Upvotes: 1

Views: 876

Answers (2)

spencerkclark

Reputation: 2097

I'm a fan of the approach in @Maximilian's answer, but if you'd like to retain the mask, xarray's where method will automatically broadcast DataArrays if you use those as an input:

In [4]: I = xr.DataArray([1, 0, 1, 1], dims=["time"])

In [5]: da.where(I == 1)
Out[5]:
<xarray.DataArray (time: 4, space: 3)>
array([[0.64729142, 0.19308236, 0.31638345],
       [       nan,        nan,        nan],
       [0.15063964, 0.53010035, 0.59722309],
       [0.96166221, 0.14651066, 0.72306466]])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
  * space    (space) <U2 'IA' 'IL' 'IN'
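
For reference, here is a self-contained sketch of the same broadcasting behaviour. It assumes the array from the question, rebuilt with a seeded random generator so it is reproducible; the names mask, masked, and filled are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same layout as the question's array, seeded for reproducibility (an assumption).
da = xr.DataArray(
    np.random.default_rng(0).random((4, 3)),
    [
        ("time", pd.date_range("2000-01-01", periods=4)),
        ("space", ["IA", "IL", "IN"]),
    ],
)

# Naming the mask's only dimension "time" tells xarray which axis it belongs to.
mask = xr.DataArray([1, 0, 1, 1], dims=["time"])

# The 1-D condition is broadcast across "space"; rows where it is False become NaN.
masked = da.where(mask == 1)

# Passing a second argument fills the masked rows with that value instead of NaN.
filled = da.where(mask == 1, 0.0)
```

Because the mask carries a dimension name, there is no need to repeat and reshape it by hand to match the shape of da.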

Upvotes: 3

Maximilian

Reputation: 8450

Probably the easiest way is to use a boolean indexer:

In [15]: I = [True, False, True, True]


In [17]: da.isel(time=I)
Out[17]:
<xarray.DataArray (time: 3, space: 3)>
array([[0.71844541, 0.59648881, 0.39432886],
       [0.93043181, 0.86698011, 0.39920336],
       [0.13478564, 0.29922154, 0.09583871]])
Coordinates:
  * time     (time) datetime64[ns] 2000-01-01 2000-01-03 2000-01-04
  * space    (space) <U2 'IA' 'IL' 'IN'

That doesn't quite get you the mask, but you could use reindex_like to get the original shape back, with the dropped rows filled with NaN.
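
A minimal sketch of that round trip, assuming the same array layout as in the question (seeded here for reproducibility). The names keep, subset, and restored are illustrative.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same layout as the question's array, seeded for reproducibility (an assumption).
da = xr.DataArray(
    np.random.default_rng(0).random((4, 3)),
    [
        ("time", pd.date_range("2000-01-01", periods=4)),
        ("space", ["IA", "IL", "IN"]),
    ],
)

# Boolean indexer along "time": keeps rows 0, 2 and 3.
keep = np.array([True, False, True, True])
subset = da.isel(time=keep)

# Reindex against the original array; the missing time label comes back as NaN.
restored = subset.reindex_like(da)
```

Here restored has the same shape and coordinates as da, which matches what da.where produces with the equivalent mask.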

Upvotes: 1
