System123
System123

Reputation: 541

Apply function along time dimension of XArray

I have an image stack stored in an XArray DataArray with dimensions time, x, y on which I'd like to apply a custom function along the time axis of each pixel such that the output is a single image of dimensions x,y.

I have tried: apply_ufunc but the function fails stating that I need to first load the data into RAM (i.e. cannot use a Dask Array). Ideally, I'd like to keep the DataArray as Dask Arrays internally as it isn't possible to load the entire stack into RAM. The exact error message is:

ValueError: apply_ufunc encountered a dask array on an argument, but handling for dask arrays has not been enabled. Either set the dask argument or load your data into memory first with .load() or .compute()

My code currently looks like this:

import numpy as np
import xarray as xr
import pandas as pd 

def special_mean(x, drop_min=False):
    s = np.sum(x)
    n = len(x)
    if drop_min:
    s = s - x.min()
    n -= 1
    return s/n

times = pd.date_range('2019-01-01', '2019-01-10', name='time')

data = xr.DataArray(np.random.rand(10, 8, 8), dims=["time", "y", "x"], coords={'time': times})
data = data.chunk({'time':10, 'x':1, 'y':1})

res = xr.apply_ufunc(special_mean, data, input_core_dims=[["time"]], kwargs={'drop_min': True})

If I do load the data into RAM using .compute then I still end up with an error which states:

ValueError: applied function returned data with unexpected number of dimensions: 0 vs 2, for dimensions ('y', 'x')

I'm not sure entirely what I am missing/doing wrong.

Upvotes: 8

Views: 10438

Answers (3)

Ali Fallah
Ali Fallah

Reputation: 1

Interestingly, I realized that, in a situation, to have the output of apply_ufunc 3D instead of 2D, we need to add "out_core_dims=[["time"]]" to the apply_ufunc.

Upvotes: 0

Ali Fallah
Ali Fallah

Reputation: 1

My aim was also to implement apply_ufunc from Xarray such that it can compute the special mean across x and y.

I enjoyed Ales example; of course by omitting the line related to the chunk. Otherwise:

ValueError: applied function returned data with unexpected number of dimensions. Received 0 dimension(s) but expected 2 dimensions with names: ('y', 'x')

Upvotes: 0

Ales
Ales

Reputation: 505

def special_mean(x, drop_min=False):
    s = np.sum(x)
    n = len(x)
    if drop_min:
        s = s - x.min()
    n -= 1
    return s/n

times = pd.date_range('2019-01-01', '2019-01-10', name='time')

data = xr.DataArray(np.random.rand(10, 8, 8), dims=["time", "y", "x"], coords={'time': times})
data = data.chunk({'time':10, 'x':1, 'y':1})

res = xr.apply_ufunc(special_mean, data, input_core_dims=[["time"]], kwargs={'drop_min': True}, dask = 'allowed', vectorize = True)

The code above using the vectorize argument should work.

Upvotes: 7

Related Questions