djhoese
djhoese

Reputation: 3667

Cleanest way to support xarray, dask, and numpy arrays in one function

I have a function that accepts multiple 2D arrays and creates two new arrays with the same shape. It was originally written to only support numpy arrays, but was "hacked" to support dask arrays if a "chunks" attribute was seen. A user who was using xarray DataArrays pointed out that this function now returns dask arrays because DataArray's have a "chunks" attribute.

I'm wondering if the dask/xarray experts can tell me what the cleanest way to support all 3 (4?) object types might be without having to duplicate code for each type (numpy array, dask array, xarray with numpy, xarray with dask). Keep in mind the inputs are 2D arrays so the masking operations involved aren't supported out of the box. The related pull request for fixing this is here. Here is what we have so far while trying to avoid adding xarray and dask as required dependencies:

if hasattr(az_, 'chunks') and not hasattr(az_, 'loc'):
    # dask array, but not xarray
    import dask.array as da
    az_ = da.where(top_s > 0, az_ + np.pi, az_)
    az_ = da.where(az_ < 0, az_ + 2 * np.pi, az_)
elif hasattr(az_, 'loc'):
    # xarray
    az_.data[top_s > 0] += np.pi
    az_.data[az_.data < 0] += 2 * np.pi
else:
    az_[top_s > 0] += np.pi
    az_[az_ < 0] += 2 * np.pi

Edit: Is there an attribute that is semi-unique to xarray objects?

Upvotes: 3

Views: 901

Answers (2)

Guy Fleegman
Guy Fleegman

Reputation: 27

I'm a little late to the party here, but if this something you're doing a lot then you might consider a function decorator that will coerce your input array down to an ndarray (or whatever the case may be), run the wrapped function, and maybe even rewrap the result to match the input type before returning it. It's something I've played around with a couple of times, but I kept deciding that I'd rather be able to leverage and support xarray objects when possible. I spent some time looking at xr-scipy when I first started playing with xarray. You might find some patterns in there that would be generic enough (or could easily be made so) while adding a little extra something for xarray objects when appropriate.

Upvotes: 1

Keisuke FUJII
Keisuke FUJII

Reputation: 1406

OK. You may want to avoid unnecessary dependence. I often define has_dask variable

try:
    import dask.array as da
    has_dask = True
except ImportError:
    has_dask = False

and then

if has_dask and isinstance(az_, da.Array):
    --- do some thing ---
else
    --- do some other thing ----

Upvotes: 1

Related Questions