Reputation: 780
What is the best way to mask a 2-D xarray DataArray with "where" along one dimension, using a 1-D array? Here is an example:
import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.random.rand(4, 3),
    [
        ("time", pd.date_range("2000-01-01", periods=4)),
        ("space", ["IA", "IL", "IN"]),
    ],
)
>>> da
<xarray.DataArray (time: 4, space: 3)>
array([[0.26519114, 0.60342615, 0.49726218],
[0.02599198, 0.91702113, 0.7771629 ],
[0.1575904 , 0.25217269, 0.74094842],
[0.7581441 , 0.83447034, 0.31751737]])
and I have a 1-D array/list:
I = [1,0,1,1]
My goal is to get all the rows where I==1. What I do right now is something like this:
I2 = np.repeat(I, repeats=da.shape[1], axis=0).reshape(da.shape)
>>> da.where(I2==1)
<xarray.DataArray (time: 4, space: 3)>
array([[0.26519114, 0.60342615, 0.49726218],
[ nan, nan, nan],
[0.1575904 , 0.25217269, 0.74094842],
[0.7581441 , 0.83447034, 0.31751737]])
Is there another way to do this?
Upvotes: 1
Views: 876
Reputation: 2097
I'm a fan of the approach in @Maximilian's answer, but if you'd like to retain the mask, xarray's where method automatically broadcasts DataArrays passed as the condition, so a 1-D mask along time is enough:
In [4]: I = xr.DataArray([1, 0, 1, 1], dims=["time"])
In [5]: da.where(I == 1)
Out[5]:
<xarray.DataArray (time: 4, space: 3)>
array([[0.64729142, 0.19308236, 0.31638345],
[ nan, nan, nan],
[0.15063964, 0.53010035, 0.59722309],
[0.96166221, 0.14651066, 0.72306466]])
Coordinates:
* time (time) datetime64[ns] 2000-01-01 2000-01-02 2000-01-03 2000-01-04
* space (space) <U2 'IA' 'IL' 'IN'
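As a rough check, here is a minimal sketch comparing the broadcast mask with the manual np.repeat approach from the question. The names mask, manual, and dropped are only illustrative, and the use of drop=True is an extra not mentioned above, in case you also want the masked rows removed:

```python
import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.random.rand(4, 3),
    [
        ("time", pd.date_range("2000-01-01", periods=4)),
        ("space", ["IA", "IL", "IN"]),
    ],
)

# 1-D mask along "time"; it has no coordinate, so it lines up by position.
mask = xr.DataArray([1, 0, 1, 1], dims=["time"])

# xarray broadcasts the 1-D condition across "space".
masked = da.where(mask == 1)

# Manual 2-D mask from the question, for comparison.
I2 = np.repeat([1, 0, 1, 1], repeats=da.shape[1], axis=0).reshape(da.shape)
manual = da.where(I2 == 1)

# Both approaches give the same values (NaN positions included).
np.testing.assert_array_equal(masked.values, manual.values)

# Optional: drop the masked rows instead of keeping them as NaN.
dropped = da.where(mask == 1, drop=True)
```

With drop=True the result has only the three time steps where the mask is 1, which is close to what the isel approach gives.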
Upvotes: 3
Reputation: 8450
Probably the easiest way is to use a boolean indexer:
In [15]: I = [True, False, True, True]
In [17]: da.isel(time=I)
Out[17]:
<xarray.DataArray (time: 3, space: 3)>
array([[0.71844541, 0.59648881, 0.39432886],
[0.93043181, 0.86698011, 0.39920336],
[0.13478564, 0.29922154, 0.09583871]])
Coordinates:
* time (time) datetime64[ns] 2000-01-01 2000-01-03 2000-01-04
* space (space) <U2 'IA' 'IL' 'IN'
That drops the unselected rows rather than masking them, but you could call reindex_like(da) on the result to get the original shape back, with the dropped rows filled with NaN.
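A minimal sketch of that round trip, assuming da is built as in the question; the names keep, subset, and restored are only illustrative:

```python
import numpy as np
import pandas as pd
import xarray as xr

da = xr.DataArray(
    np.random.rand(4, 3),
    [
        ("time", pd.date_range("2000-01-01", periods=4)),
        ("space", ["IA", "IL", "IN"]),
    ],
)

# Boolean indexer along "time": keeps rows 0, 2 and 3.
keep = np.array([True, False, True, True])
subset = da.isel(time=keep)

# Reindex onto the original coordinates; the missing time step becomes NaN.
restored = subset.reindex_like(da)
```

Here restored should have the same shape and coordinates as da, with the second row all NaN, so it matches what where would have returned.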
Upvotes: 1