Reputation: 323
I have an xarray DataArray with multiple coordinates along a single dimension. In the example below, coords a and b are defined along dimension dim1. How would I groupby using two coordinates that are defined along the same dimension(s)? Unlike this question, I am not trying to group along different dimensions, but a single one.
    import xarray as xr

    d = xr.DataArray(
        [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
        coords={
            'a': ('dim1', ['A', 'A', 'B', 'B']),
            'b': ('dim1', ['1', '2', '1', '2']),
            'c': ('dim2', ['x', 'y', 'z']),
        },
        dims=['dim1', 'dim2'],
    )

    # This gives: TypeError: `group` must be an xarray.DataArray or the
    # name of an xarray variable or dimension
    d.groupby(['a', 'b'])
Upvotes: 4
Views: 3328
Reputation: 68
This is my current workaround:
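    import numpy as np
    import xarray as xr

    def groupby_multicoords(da, fields):
        # All fields are assumed to lie along the same single dimension.
        common_dim = da.coords[fields[0]].dims[0]
        # Build a 1-D object array with one tuple of field values per position.
        tups_arr = np.empty(len(da[common_dim]), dtype=object)
        tups_arr[:] = list(zip(*(da[f].values for f in fields)))
        # Attach the tuples as an extra coordinate and group by it.
        grouping = xr.DataArray(tups_arr, dims=common_dim)
        return da.assign_coords(grouping_zip=grouping).groupby('grouping_zip')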
It is then called as groupby_multicoords(da=d, fields=['a', 'b']).
However, after grouping I am still left with the 'grouping_zip' coord. I would be glad to replace this workaround with a direct d.groupby(['a','b']).
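A possible variation on this workaround, given only as a sketch: join the coordinate values into one string label per position instead of building tuples, then group by that label. The coordinate name "ab", the "|" separator, and the extra fifth row of data are assumptions made for illustration; they are not part of the question.

```python
import xarray as xr

# Illustrative data: the question's layout plus a hypothetical fifth row,
# so that the pair ('A', '1') occurs twice and grouping has a visible effect.
d = xr.DataArray(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]],
    coords={
        "a": ("dim1", ["A", "A", "B", "B", "A"]),
        "b": ("dim1", ["1", "2", "1", "2", "1"]),
        "c": ("dim2", ["x", "y", "z"]),
    },
    dims=["dim1", "dim2"],
)

# Join the two coordinates into one string label per position along dim1.
# Assumes the separator "|" never occurs in the coordinate values.
labels = [f"{a}|{b}" for a, b in zip(d["a"].values, d["b"].values)]

# Attach the labels as a coordinate (hypothetical name "ab") and sum per group.
summed_by_label = d.assign_coords(ab=("dim1", labels)).groupby("ab").sum()
```

The reduced result is indexed by the readable labels, so a single group can be picked out with summed_by_label.sel(ab="A|1"). The extra coordinate still exists, though, so this does not remove the need for a helper.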
Upvotes: 3
Reputation: 8450
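You can stack them into a single MultiIndex with .stack(new=["dim1", "dim2"]), and then groupby that dimension. Note that .stack() combines dimensions; for two coordinates that share one dimension, as in the question, .set_index(dim1=["a", "b"]) builds the MultiIndex instead.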
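For coordinates that lie along one shared dimension, the MultiIndex route might look like the sketch below. It uses set_index rather than stack, which is an assumption about what this answer intends, and it adds a hypothetical fifth row to the question's data so that one (a, b) pair repeats. Behaviour may vary between xarray versions.

```python
import xarray as xr

# Illustrative data: the question's layout plus a hypothetical fifth row,
# so that the pair ('A', '1') occurs twice.
d = xr.DataArray(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]],
    coords={
        "a": ("dim1", ["A", "A", "B", "B", "A"]),
        "b": ("dim1", ["1", "2", "1", "2", "1"]),
        "c": ("dim2", ["x", "y", "z"]),
    },
    dims=["dim1", "dim2"],
)

# Combine coordinates a and b into a MultiIndex on dim1.
indexed = d.set_index(dim1=["a", "b"])

# Group by the MultiIndex dimension: each distinct (a, b) pair forms one group.
summed = indexed.groupby("dim1").sum()

# Optionally unstack so that a and b become separate dimensions.
by_a_and_b = summed.unstack("dim1")
```

After unstacking, a group is selected with by_a_and_b.sel(a="A", b="1"), and no helper coordinate is left on the result.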
Upvotes: 1