titusjan

Reputation: 5546

Sample 2D grid in Xarray

I have a 1D array of samples, each having a corresponding x and y coordinate. I want to transform this into a 2D grid where each grid cell contains the average of all samples falling in that grid cell. Of course I could program this by hand, but I've got the impression that this is possible with multidimensional grouping.

As example data I generate a Lissajous curve.

I put this data in a DataArray and make a MultiIndex with x and y coordinates.

my_data = <xarray.DataArray 'my_data' (time: 1200)>
array([0.000e+00, 1.000e+00, 2.000e+00, ..., 1.197e+03, 1.198e+03,
       1.199e+03])
Coordinates:
    h        (time) float64 0.0 0.5 1.0 1.5 2.0 ... 598.0 598.5 599.0 599.5
  * time     (time) object MultiIndex
  * x        (time) float64 0.0 0.3596 0.6711 0.8929 ... 0.5044 0.7812 0.9535
  * y        (time) float64 1.0 0.9498 0.8041 0.5777 ... -0.6339 -0.36 -0.04993

The full example code is as follows:

import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

DIM_TIME = 'time'

t = np.arange(1200.0)

da = xr.DataArray(
    name='my_data',
    data = t, dims=[DIM_TIME],
    coords = {
        'x': (DIM_TIME, np.sin(t / np.e)),
        'y': (DIM_TIME, np.cos(t / np.pi)),
        'h': (DIM_TIME, t/2)})

da = da.set_xindex(['x', 'y'])  # Add multi index
print(f"\n{da.name} = {da}")

bins = [-1.0, -0.6, -0.2,  0.2, 0.6, 1.0]
binned_x = da.groupby_bins("x", bins).mean().rename("bin_x_avg")
print(f"\n{binned_x.name} = {binned_x}")

da.to_dataset().plot.scatter(x='x', y='y', hue='h')
plt.show()

# Raises IndexError: too many indices
binned_xy = da.groupby_bins(("y", "x"), (bins, bins)).mean() # Something like this.

Grouping by one dimension works fine (binned_x); it gives a 1D array with 5 elements.

bin_x_avg = <xarray.DataArray 'bin_x_avg' (x_bins: 5)>
array([603.20738636, 598.84431138, 600.03870968, 596.18823529,
       597.48876404])
Coordinates:
  * x_bins   (x_bins) object (-1.0, -0.6] (-0.6, -0.2] ... (0.2, 0.6] (0.6, 1.0]

I would like to do something similar that bins in two dimensions and returns a 5 by 5 DataArray, something like the last statement in my code (binned_xy).

Is this somehow possible in xarray?

Upvotes: 0

Views: 270

Answers (1)

jspaeth

Reputation: 335

You could use flox:

import flox.xarray

# `da` and `bins` are the objects defined in the question's code.
result_raw = flox.xarray.xarray_reduce(
    da,
    da.x,
    da.y,
    func="mean",
    expected_groups=(bins, bins),
    isbin=[True, True],
    method="map-reduce",
)
print(result_raw)

<xarray.DataArray 'my_data' (x_bins: 5, y_bins: 5)>
array([[602.05454545, 610.55769231, 597.79545455, 613.41666667,
        598.03061224],
       [612.52941176, 600.84210526, 562.61538462, 640.25      ,
        586.64705882],
       [582.6744186 , 614.19230769, 630.9375    , 591.19230769,
        602.63636364],
       [601.26923077, 569.52173913, 640.75      , 507.52631579,
        614.73076923],
       [604.28      , 615.91666667, 584.89130435, 593.90740741,
        590.16666667]])
Coordinates:
  * x_bins   (x_bins) object (-1.0, -0.6] (-0.6, -0.2] ... (0.2, 0.6] (0.6, 1.0]
  * y_bins   (y_bins) object (-1.0, -0.6] (-0.6, -0.2] ... (0.2, 0.6] (0.6, 1.0]
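
As a cross-check that does not need flox, the same per-cell mean can be sketched with plain NumPy: divide a weighted 2D histogram (sum of the sample values per cell) by an unweighted one (number of samples per cell). This is only a sketch and not what flox does internally. It rebuilds x, y and the values from the formulas in the question, and the names sums, counts and grid_mean are made up for illustration. Note that np.histogram2d uses half-open bins [a, b) with the last bin closed on both sides, whereas the intervals above are (a, b], so a sample lying exactly on a bin edge may land in a different cell.

```python
import numpy as np

# Same sample data as in the question.
t = np.arange(1200.0)
x = np.sin(t / np.e)
y = np.cos(t / np.pi)
bins = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]

# Sum of the sample values in each (x, y) cell.
sums, _, _ = np.histogram2d(x, y, bins=[bins, bins], weights=t)
# Number of samples in each (x, y) cell.
counts, _, _ = np.histogram2d(x, y, bins=[bins, bins])

# Mean per cell; a cell without samples becomes NaN instead of a warning.
with np.errstate(invalid="ignore", divide="ignore"):
    grid_mean = sums / counts

print(grid_mean)
```

grid_mean has shape (5, 5) with x along the first axis and y along the second, which is the same layout as result_raw.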

If you want numeric coordinates, assign the bin midpoints as new coordinates:

# Use result_raw here; `result` is only defined further down.
x_bin_center = [b.mid for b in result_raw.x_bins.values]
y_bin_center = [b.mid for b in result_raw.y_bins.values]

result = result_raw.assign_coords(
    x_bin_center=("x_bins", x_bin_center), y_bin_center=("y_bins", y_bin_center)
).swap_dims(x_bins="x_bin_center", y_bins="y_bin_center")
print(result)

<xarray.DataArray 'my_data' (x_bin_center: 5, y_bin_center: 5)>
array([[602.05454545, 610.55769231, 597.79545455, 613.41666667,
        598.03061224],
       [612.52941176, 600.84210526, 562.61538462, 640.25      ,
        586.64705882],
       [582.6744186 , 614.19230769, 630.9375    , 591.19230769,
        602.63636364],
       [601.26923077, 569.52173913, 640.75      , 507.52631579,
        614.73076923],
       [604.28      , 615.91666667, 584.89130435, 593.90740741,
        590.16666667]])
Coordinates:
    x_bins        (x_bin_center) object (-1.0, -0.6] (-0.6, -0.2] ... (0.6, 1.0]
    y_bins        (y_bin_center) object (-1.0, -0.6] (-0.6, -0.2] ... (0.6, 1.0]
  * x_bin_center  (x_bin_center) float64 -0.8 -0.4 0.0 0.4 0.8
  * y_bin_center  (y_bin_center) float64 -0.8 -0.4 0.0 0.4 0.8
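
If only the midpoints are needed, they can also be computed directly from the bin edges rather than from the interval objects. A minimal sketch, assuming the same bins list as in the question (edges and centers are illustrative names):

```python
import numpy as np

bins = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]

# Midpoint of each pair of neighbouring bin edges.
edges = np.asarray(bins)
centers = (edges[:-1] + edges[1:]) / 2

print(centers)
```

The resulting values can be passed to assign_coords in place of the x_bin_center and y_bin_center lists above, since both axes use the same bins.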

Upvotes: 1
