Reputation: 559
When DataArrays are created with multiple coordinates for the same dimension, they do not automatically index their coordinates, i.e. the following does not work:
from xarray import DataArray
d = DataArray([0], coords={'coordA': ('dim', [0]), 'coordB': ('dim', [0])}, dims=['dim'])
d.sel(coordA=0)  # ValueError: dimensions or multi-index levels ['coordA'] do not exist
This is because the MultiIndex on dim, with levels coordA and coordB, is not created.
Is there a way to automatically create the MultiIndex on DataArray creation?
We can create the index after object creation, but this is extremely cumbersome when creating DataArrays in many places.
d = d.set_index(dim=['coordA', 'coordB'], append=True)
d.sel(coordA=0)  # works
Before xarray 0.13, it was possible to override the DataArray.__init__ method and set the index in place, but passing inplace=True now raises an error.
class DataAssembly(DataArray):
    def __init__(self, *args, **kwargs):
        super(DataAssembly, self).__init__(*args, **kwargs)
        # no longer works since 0.13
        self.set_index(dim=['coordA', 'coordB'], append=True, inplace=True)
Upvotes: 0
Views: 110
Reputation: 8510
I think you can get what you're looking for by passing in a MultiIndex to coords:
In [30]: idx = pd.MultiIndex.from_arrays([[0], [0]], names=['cA', 'cB'])
In [28]: d = xr.DataArray([0], dims=['dim'], coords=dict(dim=idx))
In [29]: d
Out[29]:
<xarray.DataArray (dim: 1)>
array([0])
Coordinates:
  * dim      (dim) MultiIndex
  - cA       (dim) int64 0
  - cB       (dim) int64 0
In [31]: d.sel(cA=0)
Out[31]:
<xarray.DataArray (cB: 1)>
array([0])
Coordinates:
  * cB       (cB) int64 0
The original approach doesn't work because it's not clear whether coordA and coordB should be two parts of a MultiIndex, or non-indexed coordinates.
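If building the MultiIndex by hand at every call site is too cumbersome, one option might be a small factory function that creates the DataArray and then calls set_index, reassigning the result instead of relying on inplace. The sketch below is only an illustration: the name indexed_data_array is made up here (it is not part of xarray), and it assumes that every group of two or more 1-D non-dimension coordinates sharing a dimension should become the levels of a MultiIndex on that dimension.

```python
import xarray as xr


def indexed_data_array(data, coords, dims):
    """Hypothetical helper: build a DataArray and index its shared-dimension coords."""
    array = xr.DataArray(data, coords=coords, dims=dims)

    # Group the names of 1-D, non-dimension coordinates by the dimension they lie along.
    levels = {}
    for name, coord in array.coords.items():
        if coord.ndim == 1 and name not in array.dims:
            levels.setdefault(coord.dims[0], []).append(name)

    # Assumption: only dimensions with at least two such coordinates get a MultiIndex.
    levels = {dim: names for dim, names in levels.items() if len(names) > 1}

    # set_index returns a new object, so return it rather than mutating `array`.
    return array.set_index(levels) if levels else array


d = indexed_data_array(
    [0],
    coords={'coordA': ('dim', [0]), 'coordB': ('dim', [0])},
    dims=['dim'],
)
selected = d.sel(coordA=0)  # selecting on a MultiIndex level now works
```

Because the grouping is inferred from the coordinates, this only suits code where every shared-dimension coordinate is meant to be an index level; otherwise passing the level names explicitly is probably safer.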
Does that make sense? Any feedback for what could be better?
Upvotes: 1