Reputation: 21
I'm not exactly sure how to phrase this, but an example makes this clear:
import xarray as xr
a = xr.DataArray(data=range(8), dims = ["measurement"], coords = {"measurement": range(8), "plant":("measurement",[0,0,0,0,1,1,1,1])})
b = xr.DataArray(data=[100, 200], dims = ["plant"], coords = {"plant": range(2)})
which gives
<xarray.DataArray (measurement: 8)>
array([0, 1, 2, 3, 4, 5, 6, 7])
Coordinates:
* measurement (measurement) int32 0 1 2 3 4 5 6 7
plant (measurement) int32 0 0 0 0 1 1 1 1
<xarray.DataArray (plant: 2)>
array([100, 200])
Coordinates:
* plant (plant) int32 0 1
I want to add the offsets per manufacturing plant from b to the measurements in a. But running a+b gives me
<xarray.DataArray (measurement: 8, plant: 2)>
array([[100, 200],
[101, 201],
[102, 202],
[103, 203],
[104, 204],
[105, 205],
[106, 206],
[107, 207]])
Coordinates:
* measurement (measurement) int32 0 1 2 3 4 5 6 7
* plant (plant) int32 0 1
so it created all kinds of extra data points: every measurement gets combined with every plant.
I can get the result I want in an ugly way:
def adder(x, y):
    return x + y.sel(plant = x.plant.values)

a.groupby("measurement").map(lambda x: adder(x, b))
which gives the desired answer
<xarray.DataArray (measurement: 8)>
array([100, 101, 102, 103, 204, 205, 206, 207])
Coordinates:
* measurement (measurement) int32 0 1 2 3 4 5 6 7
plant (measurement) int32 0 0 0 0 1 1 1 1
How do I do this in a nice way?
Upvotes: 2
Views: 90
Reputation: 15442
Almost there!
Use xarray’s advanced indexing: select the data using a DataArray as the indexer instead of a numpy array. The result then takes on the dimensions of the indexer:
x + y.sel(plant = x.plant)
In this case, because x.plant is indexed by measurement, the values of y are picked according to the plant labels, but the dimension of the resulting array is measurement rather than plant. It can then safely be added to x without creating a new dimension, because the dims are aligned. Applied to the arrays in the question, no groupby is needed: a + b.sel(plant = a.plant).
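As a rough end-to-end sketch using the arrays from the question (the names offsets and result are only illustrative, and I build the data from explicit lists rather than range, which should not change the outcome):

```python
import xarray as xr

# Same arrays as in the question, built from explicit lists.
a = xr.DataArray(
    data=[0, 1, 2, 3, 4, 5, 6, 7],
    dims=["measurement"],
    coords={
        "measurement": [0, 1, 2, 3, 4, 5, 6, 7],
        "plant": ("measurement", [0, 0, 0, 0, 1, 1, 1, 1]),
    },
)
b = xr.DataArray(data=[100, 200], dims=["plant"], coords={"plant": [0, 1]})

# Index b with a DataArray: the result follows the indexer's dimension
# ("measurement"), so there is one offset per measurement.
offsets = b.sel(plant=a.plant)

# Both operands now share the "measurement" dimension, so nothing is broadcast.
result = a + offsets

print(result.dims)    # ('measurement',)
print(result.values)  # [100 101 102 103 204 205 206 207]
```

The only difference from the adder function in the question is passing x.plant instead of x.plant.values, which keeps the measurement dimension attached to the indexer.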
Upvotes: 1