Martin
Martin

Reputation: 11

How to count the sum of at least two consecutive days with same the values, from xarray dataset in python

This is my first question here, so if you need something more specific, please let me know.

I have Xarray dataset with 3 dimensions (time, lon, lat) and heat variable. The heat is the values 1 and 0 (1 = more than 30degC, 0 = less than 30degC), and I want to calculate the number of events with at least three consecutive days with value 1 (time is in daily step)

And create the xarray dataset with only one-time value (one day) where each pixel will contain the number of events with at least three consecutive days with value 1.

time 155 = all July days for years 2026-2030

xarray.DataArray 'heat' time: 155, rlat: 6, rlon: 4

Do you have some ideas on how to do this?

here you can find the netcdf data EC7_heatjulZA26_30.nc

tm = 'C:/Path_to_NC/EC7_heatjulZA26_30.nc'
dmax = xr.open_dataset(tm, decode_coords="all")

da_max = dmax['tasmax']
da_max

*xarray.DataArray'tasmax'time: 155, rlat: 6, rlon: 4*

I sum (the 1 and 0) the data to one 6 x 4 raster, but I want to sum only those which are consecutively behind each other for at least 3 days.

HW=np.sum(da_max,axis = 0)
HW

array([[24, 20, 14, 11],
       [26, 21, 15, 11],
       [23, 23, 15, 12],
       [14, 15, 14, 11],
       [12, 12, 14, 11],
       [14, 12, 15, 13]])

Upvotes: 1

Views: 413

Answers (1)

msi_gerva
msi_gerva

Reputation: 2078

I can propose following solution:

#!/usr/bin/env ipython
# ---------------------
import numpy as np
import xarray as xr
from pylab import pcolormesh,show,colorbar,plot,title,legend,subplot,savefig
# -------------------
fin = 'EC7_heatjulZA26_30.nc' # filename in...

dfin = xr.open_dataset(fin) # let us use xarray to read data ...
vin = dfin['tasmax'].values # get only the values ...
ntime,ny,nx = np.shape(vin) # get the dimensions...
# --------------------------------------
dvin = np.diff(vin,axis=0) # let us take the diff along time axis...
# --------------------------------------
counts = np.sum(vin,axis=0) # let us count days with temp over threshold...
pcolormesh(counts);colorbar();title('Number of days over threshold');show() # let us view the map...
# --------------------------------------
dvin[dvin<0] = 0.e0; # let us remove the -1 when the temperature has dropped below the treshold...
icounts = np.sum(dvin,axis=0)
pcolormesh(icounts);colorbar();title('Number of instances over threshold (simple)');savefig('simple.png',bbox_inches='tight');show() # let us view the map...
# let us check:
plot(vin[:,1,0]);title('Number of instances found '+str(icounts[1,0]));show() # if you check, the number of instances is too high -- 9 instead of 6
# ---------------------------------------
# let us calculate correct one:
ntime,ny,nx = np.shape(vin) # get the dimensions...
dvin_org = np.diff(vin,axis=0); # the diff...

dvin = np.concatenate((np.zeros((1,ny,nx)),dvin_org),axis=0); # make the diff and original data same size...
dvin_n = np.concatenate((np.zeros((2,ny,nx)),dvin_org[:-1,:,:]),axis=0); # shift the diff +1 index in time
# ------------------------------------------------------------------------------------------------------
dvin[np.where((dvin_n==1)&(vin==1))] = 1.0 # in original diff, add one to the location, where the derivate does not change -- these are the time moments we actually wanted...
# -------------------------------------------------------
icounts = np.sum(dvin,axis=0) # calculate the correct number of instances over treshold
pcolormesh(icounts);colorbar();title('Number of instances over threshold (complex)');savefig('corrected.png',bbox_inches='tight');show() # let us view the map...
# let us check:
plot(vin[:,2,2]);title('Number of instances found '+str(icounts[2,2]));show()

So, the original calculation gives figure like this:enter image description here

where the occurrences goes up to 9 times in one grid cell. But, this is overestimated as the time-series look like this (9 is taken from the simple solution):

enter image description here

Problem is that we counted also couple of events where the temperature went over threshold only for one day. Therefore, made additional check with using the shifted derivate, see the code up.

In any case, the final solution comes then like this:

enter image description here

Hope this helps!

Upvotes: 0

Related Questions