Reputation: 73
I have a set of monthly NetCDF files that together contain every month across many years (for example, from Jan 1948 to Dec 2018).
How can I use xarray to conveniently compute the seasonal average for each year?
There are examples that use GroupBy to calculate seasonal averages, but they seem to pool all the months from all years into just 4 groups, which does not give a seasonal average for every individual year.
Upvotes: 3
Views: 3179
Reputation: 21
A bit late, but I struggled with this recently and found a solution that worked for me; hopefully it helps others. The following worked nicely for turning monthly data into seasonal means:
seasonal = ds.resample(time='QS-DEC').mean('time')
Each season is labelled by its first month, so you can select a season by the month of its time label: DJF=12, MAM=3, JJA=6, SON=9. Note that the season's year is also taken from the first month, so the season Dec 2023 to Feb 2024 is labelled with the year 2023.
This does not weight months by their length (another answer shows how to do that), but it is a simple way to get seasonal means for each year.
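As a minimal sketch of the selection step described above, assuming a synthetic monthly series (the data and the names `da`, `djf`, `jja` are made up for illustration and are not from the original post):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly data, Jan 2000 to Dec 2002; each value is its month number.
time = pd.date_range("2000-01-01", "2002-12-01", freq="MS")
da = xr.DataArray(
    time.month.values.astype(float),
    coords={"time": time},
    dims="time",
    name="x",
)

# Unweighted seasonal means, with seasons starting in Dec, Mar, Jun, Sep.
seasonal = da.resample(time="QS-DEC").mean("time")

# Select one season per year by the month of its label.
djf = seasonal.sel(time=seasonal.time.dt.month == 12)
jja = seasonal.sel(time=seasonal.time.dt.month == 6)
```

Seasons at the edges of the record can be incomplete: here the first DJF bin (labelled Dec 1999) only contains Jan and Feb 2000, and the last one only contains Dec 2002, so you may want to drop those before further analysis.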
Upvotes: 2
Reputation: 2097
It sounds like you are looking for a resample-type operation. Using the get_dpm function from the documentation example you linked to, I think something like the following should work:
month_length = xr.DataArray(
    get_dpm(ds.time.to_index(), calendar='standard'),
    coords=[ds.time],
    name='month_length'
)
result = ((ds * month_length).resample(time='QS-DEC').sum() /
          month_length.resample(time='QS-DEC').sum())
Using the 'QS-DEC' frequency will split the data into consecutive three-month periods, anchored at December 1st.
If your data has missing values, you'll need to modify this weighted mean operation to account for that (i.e. you need to mask month_length before taking the sum in the denominator):
result = ((ds * month_length).resample(time='QS-DEC').sum() /
          month_length.where(ds.notnull()).resample(time='QS-DEC').sum())
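For a self-contained check of the masked weighted mean, here is a rough sketch on synthetic data. It swaps get_dpm for xarray's built-in `time.dt.days_in_month`, which assumes a standard calendar, and the array `da` with its missing January is invented purely for illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr

# One year of synthetic monthly data, Dec 2000 to Nov 2001, values 0..11.
time = pd.date_range("2000-12-01", "2001-11-01", freq="MS")
da = xr.DataArray(np.arange(12, dtype=float), coords={"time": time}, dims="time")
da[1] = np.nan  # pretend Jan 2001 is missing

# Assumption: standard calendar, so days_in_month stands in for get_dpm.
month_length = da.time.dt.days_in_month

# Mask the weights where the data is missing, then form the weighted mean.
weights = month_length.where(da.notnull())
numerator = (da * month_length).resample(time="QS-DEC").sum()
denominator = weights.resample(time="QS-DEC").sum()
result = numerator / denominator
```

In this sketch the DJF value is (0*31 + 2*28) / (31 + 28), because the missing January contributes to neither the numerator nor the denominator, while MAM is the ordinary day-weighted mean of its three months.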
Upvotes: 4