Reputation: 1
I am looking at output from an ocean model, and from the output I would like to create a grid (x,y equivalent to lat,lon) of temperatures at the bottom of the water column, i.e., the deepest grid cells. Within the xarray dataset I have the maximum depths (see below, "Depth").
I can do this with a very slow loop, but was wondering if there was a way to avoid a loop, or at least part of the loop.
Here's what the data a code look like so far:
# load data as xarray
data_dir = 'run04'
ds1 = open_mdsdataset(data_dir,iters=np.arange(0,10001,5000),prefix=['U','V','W','S','T','Eta'])
ds1 = ds1.rename({'T':'Tt'}) # T doesn't work because it thinks its transpose
What ds1 looks:
<xarray.Dataset>
Dimensions: (XC: 40, XG: 40, YC: 30, YG: 30, Z: 100, Zl: 100, Zp1: 101, Zu: 100, time: 3)
Coordinates:
* XC (XC) >f4 2500.0 7500.0 12500.0 17500.0 22500.0 27500.0 32500.0 ...
* YC (YC) >f4 2500.0 7500.0 12500.0 17500.0 22500.0 27500.0 32500.0 ...
* XG (XG) >f4 0.0 5000.0 10000.0 15000.0 20000.0 25000.0 30000.0 ...
* YG (YG) >f4 0.0 5000.0 10000.0 15000.0 20000.0 25000.0 30000.0 ...
* Z (Z) >f4 -7.0 -21.0 -35.0 -49.0 -63.0 -77.0 -91.0 -105.0 -119.0 ...
* Zp1 (Zp1) >f4 0.0 -14.0 -28.0 -42.0 -56.0 -70.0 -84.0 -98.0 -112.0 ...
* Zu (Zu) >f4 -14.0 -28.0 -42.0 -56.0 -70.0 -84.0 -98.0 -112.0 ...
* Zl (Zl) >f4 0.0 -14.0 -28.0 -42.0 -56.0 -70.0 -84.0 -98.0 -112.0 ...
rA (YC, XC) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
dxG (YG, XC) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
dyG (YC, XG) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
Depth (YC, XC) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
rAz (YG, XG) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
dxC (YC, XG) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
dyC (YG, XC) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
rAw (YC, XG) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
rAs (YG, XC) >f4 dask.array<shape=(30, 40), chunksize=(30, 40)>
drC (Zp1) >f4 dask.array<shape=(101,), chunksize=(101,)>
drF (Z) >f4 dask.array<shape=(100,), chunksize=(100,)>
PHrefC (Z) >f4 dask.array<shape=(100,), chunksize=(100,)>
PHrefF (Zp1) >f4 dask.array<shape=(101,), chunksize=(101,)>
hFacC (Z, YC, XC) >f4 dask.array<shape=(100, 30, 40), chunksize=(100, 30, 40)>
hFacW (Z, YC, XG) >f4 dask.array<shape=(100, 30, 40), chunksize=(100, 30, 40)>
hFacS (Z, YG, XC) >f4 dask.array<shape=(100, 30, 40), chunksize=(100, 30, 40)>
iter (time) int64 dask.array<shape=(3,), chunksize=(1,)>
* time (time) int64 0 5000 10000
Data variables:
Eta (time, YC, XC) float32 dask.array<shape=(3, 30, 40), chunksize=(1, 30, 40)>
V (time, Z, YG, XC) float32 dask.array<shape=(3, 100, 30, 40), chunksize=(1, 100, 30, 40)>
W (time, Zl, YC, XC) float32 dask.array<shape=(3, 100, 30, 40), chunksize=(1, 100, 30, 40)>
S (time, Z, YC, XC) float32 dask.array<shape=(3, 100, 30, 40), chunksize=(1, 100, 30, 40)>
U (time, Z, YC, XG) float32 dask.array<shape=(3, 100, 30, 40), chunksize=(1, 100, 30, 40)>
Tt (time, Z, YC, XC) float32 dask.array<shape=(3, 100, 30, 40), chunksize=(1, 100, 30, 40)>
And the loop to get temperature values at the deepest cells:
# find the deepest wet cell at each gridpoint
# loop through timesteps
t_at_bottom1 = np.zeros((ds1.time.size,ds1.YC.size,ds1.XC.size))
for ti in np.arange(0,ds1.time.size,1):
# loop through x,y indices
for yi in np.arange(0,ds1.YC.size,1):
for xi in np.arange(0,ds1.XC.size,1):
# look for the grid cell closest to the bottom
t_at_bottom1[ti,yi,xi] = ds1.Tt.sel(time=ds1.time[ti],Z=-ds1.Depth.values[yi,xi],YC=ds1.YC[yi],XC=ds1.XC[xi],method='nearest')
Thanks for your help.
Upvotes: 0
Views: 1482
Reputation: 1683
Have a look at vectorized indexing.
Simple example in 2D:
import xarray as xr
import numpy as np
data = np.arange(12).reshape(3, 4)
da = xr.DataArray(data, dims=['depth', 'x'],
coords=dict(depth=[0, 1, 2], x=[0, 1, 2, 3]))
da
looks like
<xarray.DataArray (depth: 3, x: 4)>
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Coordinates:
* depth (depth) int64 0 1 2
* x (x) int64 0 1 2 3
Now x-depended depths can be chosen by index with another dataarray:
sel = xr.DataArray([0, 1, 0, 2], dims=['x'])
da[sel]
This returns
<xarray.DataArray (x: 4)>
array([ 0, 5, 2, 11])
Coordinates:
depth (x) int64 0 1 0 2
* x (x) int64 0 1 2 3
This requires xarray version 0.10.0 or later.
Upvotes: 3