Reputation: 21
I am working on analyzing atmospheric data, and I need to take averages for a parameter at a location over each decade. I have data from 1950-2020, and need to take means for 1950-1959, 1960-1969, and so on. I have gotten as far as using ds_annual_means = ds.groupby('time.year').mean() to get the annual averages, but there doesn't seem to be a built-in group larger than year.
I have also tried grouping by bins, but this doesn't seem to produce what I am looking for, and since it changes the time coordinate to (obj) instead of (datetime64[ns]), I can't save the result as an .nc file, which is my ultimate goal.
Any advice would be greatly appreciated!
Upvotes: 1
Views: 439
Reputation: 481
Simply create a new variable which specifies the decade, then group by that one:
year = ds["time"].dt.year
decade = ((year - year[0]) / 10).astype(int)
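Of course, check that the first year in your data is the right one to start counting from: the decades are measured from year[0], so data starting in 1953 would be grouped as 1953-1962, 1963-1972, and so on.
Next, assign the new coordinate to your dataset, and group by it as you would with dt.year: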
ds = ds.assign_coords({'decade':decade})
result = ds["atmospheric_data"].groupby("decade").mean("time")
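For reference, here is a minimal self-contained sketch of this approach on synthetic data. The monthly time axis and the values are made up for illustration, and "atmospheric_data" is just the placeholder name used above. As a small variation, the decade is labelled by its calendar start year (1950, 1960, ...) using integer division rather than by an offset from the first year; that is an assumption about what is wanted, not something the approach requires.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series for 1950-2019 (made-up values, for illustration only)
time = pd.date_range("1950-01-01", "2019-12-01", freq="MS")
values = np.arange(time.size, dtype=float)
ds = xr.Dataset({"atmospheric_data": ("time", values)}, coords={"time": time})

# Label every time step with the first year of its calendar decade
decade = (ds["time"].dt.year // 10) * 10
ds = ds.assign_coords(decade=decade)

# Group by the new coordinate and average over time within each group
result = ds["atmospheric_data"].groupby("decade").mean("time")
print(result["decade"].values)
```

The resulting decade coordinate holds plain integers rather than Python objects, which is usually easier to write to a netCDF file than the interval labels produced by binning.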
Upvotes: 4
Reputation: 798
Naively, I would try to create my own "group by decade" function.
Something like:
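import xarray

start = 1950
end = 2020
accumulator = []
for year in range(start, end, 10):
    # Boolean mask selecting the time steps that fall in this decade
    decade_mask = ds.time.dt.year.isin(range(year, year + 10))
    # Average over time only, so any spatial dimensions are kept
    decade_mean = ds.sel(time=decade_mask).mean("time")
    accumulator.append(decade_mean)
# Stack the decade means along a new "time" dimension
result = xarray.concat(accumulator, dim="time")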
I'm not sure this is the best way to do this though.
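Since the question asks for a datetime time coordinate in the output, one possible refinement of the loop idea is to pass a pandas DatetimeIndex as the concat dimension. This is only a sketch on synthetic data: the dataset, the variable name "atmospheric_data", and the choice to label each decade by its first day are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series for 1950-2019 (made-up values, for illustration only)
time = pd.date_range("1950-01-01", "2019-12-01", freq="MS")
ds = xr.Dataset(
    {"atmospheric_data": ("time", np.arange(time.size, dtype=float))},
    coords={"time": time},
)

starts = range(1950, 2020, 10)

# One time-averaged dataset per decade
decade_means = [
    ds.sel(time=ds.time.dt.year.isin(range(y, y + 10))).mean("time")
    for y in starts
]

# Assumption: label each decade by its first day so "time" stays a datetime coordinate
decade_index = pd.DatetimeIndex([f"{y}-01-01" for y in starts], name="time")
result = xr.concat(decade_means, dim=decade_index)
print(result["time"].dtype)
```

Because the new dimension is built from a DatetimeIndex, the output keeps a datetime time axis instead of an object-typed one.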
Upvotes: 3