Reputation: 1027
Simple question: I want not only the maximum value of an xarray DataArray but also its coordinates. How can I do that?
I can, of course, write my own simple reduce function, but I wonder if there is anything built-in in xarray?
Upvotes: 28
Views: 16066
Reputation: 1157
If you want the coordinates of the single maximum value in an xarray da, then I don't think the accepted answer is correct. Furthermore, if the array naturally contains NaN values, the answers above that use drop=True will remove those NaN values and give incorrect coordinates. This code worked well for me:
where = np.where(da==da.max())
Which will give, for example,
(array([0]), array([25]), array([598]), array([25]))
This shows that the maximum value is at position 0, 25, 598, 25. It can also be accessed with da[where].
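The tuple returned by np.where holds integer positions, not coordinate labels. A minimal sketch of mapping those positions back to labels, assuming a small 2-D array whose dimension names (lat, lon) and values are made up for illustration:

```python
import numpy as np
import xarray as xr

# Hypothetical example data; the NaN is left in place on purpose.
da = xr.DataArray(
    np.array([[1.0, np.nan, 3.0], [7.0, 2.0, 5.0]]),
    dims=["lat", "lon"],
    coords={"lat": [10, 20], "lon": [100, 110, 120]},
)

# da.max() skips NaN by default, so the comparison is against 7.0.
positions = np.where((da == da.max()).values)

# Map each integer position back to the coordinate label of its dimension.
labels = {dim: da[dim].values[idx] for dim, idx in zip(da.dims, positions)}
```

Each entry in labels is an array, so ties for the maximum would show up as several labels per dimension.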
Upvotes: 1
Reputation: 41
This will return the coordinates of the maximum value in an xarray DataArray.
max_point = xarraydata.where(xarraydata == xarraydata.max(), drop=True).squeeze()
Upvotes: 1
Reputation: 8480
Update:
xarray now has the idxmax method for selecting the coords of the max values along one dimension:
In [8]: da = xr.DataArray(
   ...:     np.random.rand(2,3),
   ...:     dims=list('ab'),
   ...:     coords=dict(a=list('xy'), b=list('ijk'))
   ...: )
In [14]: da
Out[14]:
<xarray.DataArray (a: 2, b: 3)>
array([[0.63059257, 0.00155463, 0.60763418],
       [0.19680788, 0.43953352, 0.05602777]])
Coordinates:
  * a        (a) <U1 'x' 'y'
  * b        (b) <U1 'i' 'j' 'k'
In [13]: da.idxmax('a')
Out[13]:
<xarray.DataArray 'a' (b: 3)>
array(['x', 'y', 'x'], dtype=object)
Coordinates:
  * b        (b) <U1 'i' 'j' 'k'
The below answer is still relevant for the maximum over multiple dimensions, though.
You can use da.where() to filter based on the max value:
In [17]: da = xr.DataArray(
    np.random.rand(2,3),
    dims=list('ab'),
    coords=dict(a=list('xy'), b=list('ijk'))
)
In [18]: da.where(da==da.max(), drop=True).squeeze()
Out[18]:
<xarray.DataArray ()>
array(0.96213673)
Coordinates:
    a        <U1 'x'
    b        <U1 'j'
Edit: updated the example to show the indexes more clearly, now that xarray doesn't have default indexes
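For the maximum over several dimensions, argmax with a list of dimensions may be an alternative to the where() filter. A minimal sketch, assuming xarray 0.16 or later (where argmax accepts a list for dim) and using made-up values in place of the random data above:

```python
import numpy as np
import xarray as xr

# Hypothetical fixed values so the result is reproducible.
da = xr.DataArray(
    np.array([[0.1, 0.9, 0.3], [0.4, 0.2, 0.5]]),
    dims=["a", "b"],
    coords={"a": ["x", "y"], "b": ["i", "j", "k"]},
)

# With a list of dimensions, argmax returns a dict of integer indices,
# one entry per dimension.
idx = da.argmax(dim=["a", "b"])

# The dict can be passed straight to isel to select the maximum element.
peak = da.isel(idx)

# The coordinate labels of the maximum are scalar coordinates on the result.
a_label = peak["a"].item()
b_label = peak["b"].item()
```

Unlike the where() approach, this picks the first occurrence when the maximum is not unique.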
Upvotes: 37
Reputation: 31
You can also use stack:
Let's say data is a 3D variable with dimensions time, longitude and latitude, and you want the coordinates of the maximum at each time step.
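stackdata = data.stack(z=('lon', 'lat'))  # merge lon and lat into a single dimension z
maxi = stackdata.argmax(dim='z')  # position of the maximum along z for each time step
maxipos = stackdata['z'][maxi]  # (lon, lat) pairs at those positions
ntime = maxi.size  # number of time steps
lonmax = [maxipos.values[itr][0] for itr in range(ntime)]
latmax = [maxipos.values[itr][1] for itr in range(ntime)]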
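The same per-time-step result can probably be obtained without stacking, by letting argmax reduce over both spatial dimensions at once. A minimal sketch, assuming xarray 0.16 or later and a made-up array whose dimensions are named time, lon and lat:

```python
import numpy as np
import xarray as xr

# Hypothetical data: 2 time steps, 2 longitudes, 3 latitudes.
data = xr.DataArray(
    np.array([[[1, 2, 3], [9, 5, 6]],
              [[1, 8, 3], [4, 5, 6]]]),
    dims=["time", "lon", "lat"],
    coords={"time": [0, 1], "lon": [10.0, 20.0], "lat": [-5.0, 0.0, 5.0]},
)

# One integer index per time step for each reduced dimension.
idx = data.argmax(dim=["lon", "lat"])

# Look up the coordinate labels at those integer positions.
lonmax = data["lon"].isel(lon=idx["lon"]).values
latmax = data["lat"].isel(lat=idx["lat"]).values
```

Here lonmax and latmax are NumPy arrays with one entry per time step, so no explicit loop over time is needed.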
Upvotes: 3
Reputation: 9603
An idxmax() method would be very welcome in xarray, but nobody has gotten around to implementing it yet.
For now, you can find the coordinates of the maximum by combining argmax and isel:
>>> array = xarray.DataArray(
...     [[1, 2, 3], [3, 2, 1]],
...     dims=['x', 'y'],
...     coords={'x': [1, 2], 'y': ['a', 'b', 'c']})
>>> array
<xarray.DataArray (x: 2, y: 3)>
array([[1, 2, 3],
       [3, 2, 1]])
Coordinates:
  * x        (x) int64 1 2
  * y        (y) <U1 'a' 'b' 'c'
>>> array.isel(y=array.argmax('y'))
<xarray.DataArray (x: 2)>
array([3, 3])
Coordinates:
  * x        (x) int64 1 2
    y        (x) <U1 'c' 'a'
This is probably what .max() should do in every case! Unfortunately we're not quite there yet.
The problem is that it doesn't yet generalize to the maximum over multiple dimensions in the way we would like:
>>> array.argmax() # what??
<xarray.DataArray ()>
array(2)
The problem is that it's automatically flattening, like np.argmax. Instead, we probably want something like an array of tuples or a tuple of arrays, indicating the original integer coordinates for the maximum. Contributions for this would also be welcome -- see this issue for more details.
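In the meantime, the flattened index can be converted back into one integer position per dimension with np.unravel_index. A minimal sketch, working on the underlying NumPy values of the example array; the helper names (flat_index, positions, indexers, peak) are introduced here only for illustration:

```python
import numpy as np
import xarray as xr

array = xr.DataArray(
    [[1, 2, 3], [3, 2, 1]],
    dims=["x", "y"],
    coords={"x": [1, 2], "y": ["a", "b", "c"]},
)

# Flattened position of the first maximum, as np.argmax reports it.
flat_index = int(np.argmax(array.values))

# Convert the flat position into one integer index per dimension.
positions = np.unravel_index(flat_index, array.shape)

# Pair each index with its dimension name and select that element.
indexers = dict(zip(array.dims, positions))
peak = array.isel(indexers)
```

The selected element keeps its coordinates as scalars, so peak["x"] and peak["y"] give the labels of the maximum.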
Upvotes: 7