I have multiple 2D xarray.DataArray objects which do not (necessarily) share common coordinate values. Imagine these DataArrays having come from slicing out multiple (non-overlapping) bounding boxes in a larger gridded dataset.
I'd like to combine the arrays along a new axis fid (i.e. a string identifier for the bounding box used to slice each array) without the original 2D coordinates "expanding" and filling with nan.
For example:
import xarray as xr
import numpy as np
# create some toy gridded data
nx = 9
ny = 9
data = np.random.randint(5, size=(nx, ny))
x_coord = np.linspace(0, 1, nx)
y_coord = np.linspace(0, 1, ny)
da = xr.DataArray(
    data,
    dims=("x_coord", "y_coord"),
    coords={"x_coord": x_coord, "y_coord": y_coord}
)
# slice out two subsets of the gridded data
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])
>>> a
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]]])
Coordinates:
  * fid      (fid) object 'abc123'
  * x_coord  (x_coord) float64 0.125 0.25 0.375
  * y_coord  (y_coord) float64 0.25 0.375 0.5
>>> b
<xarray.DataArray (fid: 1, x_coord: 3, y_coord: 3)>
array([[[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'def456'
  * x_coord  (x_coord) float64 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.625 0.75 0.875
If I naively try to concatenate these along the fid dimension, x_coord and y_coord expand to encompass all coordinate values from both sources, resulting in a (2, 6, 6) shaped array that is filled with nan in most places:
>>> xr.concat([a, b], dim="fid")
<xarray.DataArray (fid: 2, x_coord: 6, y_coord: 6)>
array([[[ 3.,  4.,  3., nan, nan, nan],
        [ 3.,  1.,  0., nan, nan, nan],
        [ 3.,  2.,  4., nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan]],

       [[nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan, nan, nan, nan],
        [nan, nan, nan,  4.,  3.,  0.],
        [nan, nan, nan,  3.,  2.,  2.],
        [nan, nan, nan,  4.,  2.,  1.]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * x_coord  (x_coord) float64 0.125 0.25 0.375 0.75 0.875 1.0
  * y_coord  (y_coord) float64 0.25 0.375 0.5 0.625 0.75 0.875
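As far as I can tell, the expansion comes from how xr.concat aligns the dimensions that are not being concatenated: the output above matches an outer join, i.e. the union of the labels. The sketch below spells that out with join="outer" and contrasts it with join="override", which keeps the (2, 3, 3) shape but simply reuses the first array's coordinates, so the second box's original coordinates are lost. The deterministic toy data is an assumption made for illustration, not the random data used above.

```python
import numpy as np
import xarray as xr

# Deterministic stand-in for the toy grid (assumption: values are arbitrary).
da = xr.DataArray(
    np.arange(81).reshape(9, 9),
    dims=("x_coord", "y_coord"),
    coords={"x_coord": np.linspace(0, 1, 9), "y_coord": np.linspace(0, 1, 9)},
)
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])

# Outer join: x_coord and y_coord become the union of both sets of labels,
# and every cell missing from an input is filled with nan.
outer = xr.concat([a, b], dim="fid", join="outer")

# Override: same-sized indexes are overwritten with those of the first array,
# so the shape is preserved but b's own coordinate values are discarded.
overridden = xr.concat([a, b], dim="fid", join="override")

print(outer.shape, overridden.shape)
```

So join="override" alone is not enough here, since the goal is to keep each box's original coordinates as well.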
I want the resulting array to remain (2, 3, 3) shaped. My idea is to pre-process each individual array to use a "local" coordinate system (xi, yi), e.g.
a_ = a.assign_coords(x_coord=a.x_coord-a.x_coord.min(), y_coord=a.y_coord-a.y_coord.min())
b_ = b.assign_coords(x_coord=b.x_coord-b.x_coord.min(), y_coord=b.y_coord-b.y_coord.min())
>>> xr.concat([a_, b_], dim="fid").rename(x_coord="xi", y_coord="yi")
<xarray.DataArray (fid: 2, xi: 3, yi: 3)>
array([[[3, 4, 3],
        [3, 1, 0],
        [3, 2, 4]],

       [[4, 3, 0],
        [3, 2, 2],
        [4, 2, 1]]])
Coordinates:
  * fid      (fid) object 'abc123' 'def456'
  * xi       (xi) float64 0.0 0.125 0.25
  * yi       (yi) float64 0.0 0.125 0.25
... but I also want to keep the "original" coordinate system for each array. I imagine this will involve creating a multidimensional coordinate such that the resulting array has coordinates that go something like:
Coordinates:
* fid (fid) object
array(['abc123', 'def456'])
* xi (xi) float64
array([0.0, 0.125, 0.25])
* yi (yi) float64
array([0.0, 0.125, 0.25])
x_coord (fid, xi) float64
array([[0.125 0.25 0.375], [0.75 0.875 1.0]])
y_coord (fid, yi) float64
array([[0.25 0.375 0.5], [0.625 0.75 0.875]])
I'm just not quite sure of a neat way to go about doing this!
Almost there! I’d take the same approach in combination with xr.DataArray.swap_dims:
a_ = a.assign_coords(
    xi=(a.x_coord - a.x_coord.min()),
    yi=(a.y_coord - a.y_coord.min()),
).swap_dims({
    "x_coord": "xi",
    "y_coord": "yi",
})
b_ = b.assign_coords(
    xi=(b.x_coord - b.x_coord.min()),
    yi=(b.y_coord - b.y_coord.min()),
).swap_dims({
    "x_coord": "xi",
    "y_coord": "yi",
})
result = xr.concat([a_, b_], dim="fid")
This will preserve the original coordinates while indexing by xi, yi. Note that you will not be able to use the original label values to slice or select, e.g. with .sel, because x_coord and y_coord are no longer dimension (index) coordinates. Instead, you’ll need to use your new dim labels (fid, xi, yi).
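For reference, here is a minimal end-to-end sketch of this approach that can be run on its own. The to_local helper is a hypothetical name introduced only for illustration, and the deterministic toy data is an assumption. Passing coords="different" explicitly is also my choice rather than a requirement: it asks xr.concat to stack coordinates that differ between inputs along fid, which is what turns x_coord and y_coord into 2D (fid, xi) and (fid, yi) coordinates. The last lines show one way to filter on the original coordinate values with .where, since .sel no longer applies to them.

```python
import numpy as np
import xarray as xr

# Deterministic stand-in for the toy grid (assumption: values are arbitrary).
da = xr.DataArray(
    np.arange(81).reshape(9, 9),
    dims=("x_coord", "y_coord"),
    coords={"x_coord": np.linspace(0, 1, 9), "y_coord": np.linspace(0, 1, 9)},
)
a = da.isel(x_coord=[1, 2, 3], y_coord=[2, 3, 4]).expand_dims(fid=["abc123"])
b = da.isel(x_coord=[6, 7, 8], y_coord=[5, 6, 7]).expand_dims(fid=["def456"])


def to_local(arr):
    """Hypothetical helper: index by offsets (xi, yi), keep x_coord/y_coord."""
    return arr.assign_coords(
        xi=arr.x_coord - arr.x_coord.min(),
        yi=arr.y_coord - arr.y_coord.min(),
    ).swap_dims({"x_coord": "xi", "y_coord": "yi"})


# coords="different" is spelled out so that x_coord and y_coord, which differ
# between the inputs, are stacked along fid instead of taken from one input.
result = xr.concat([to_local(a), to_local(b)], dim="fid", coords="different")
print(result.coords)

# Pick one box by fid, then filter on its original coordinate values.
box = result.sel(fid="def456")
subset = box.where(box.x_coord >= 0.875, drop=True)
print(subset.x_coord.values)
```

One caveat: the offsets only line up across boxes when the floating-point subtraction gives identical xi and yi values for every array, as it does on this grid. If that is not guaranteed, integer positions (for example np.arange over each axis) may be a safer choice for xi and yi.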