Anna Sofia

Reputation: 21

Subtract 10 years from time variable in netCDF-file

I need to subtract 10 years from the time variable (dimension) in a netCDF-file.

The original input file starts at 1969-06-20 00:00 and ends at 1971-07-01 00:00, with 3H time steps. The time unit in the original file is unix seconds since 1970-01-01, indicated as datetime64[ns] format in Python. I want to subtract 10 years from every timestep so that the new file starts at 1959-06-20 00:00 and ends at 1961-07-01 00:00.

I am loading the netCDF-file in Python with xarray and subtracting 10 years by using np.timedelta64. Then I save the modified dataset as a new netCDF-file, see my code below.

import xarray as xr 
import numpy as np

# load original file
ds = xr.open_dataset('original_file.nc')
ds = ds.load()

# subtract 10 years from time dimension
ds['time'] = ds.time - np.timedelta64(3652, 'D')

# save modified dataset as new nc-file
ds.to_netcdf('new_file.nc')

I am experiencing two problems.

Firstly, since my file contains a leap year (1960), it is not sufficient to use 365.242 days/yr * 10 yrs = 3652 days; as a result, the output is one day off and starts at 1959-06-21 00:00. But I get the following error message when I try to use np.timedelta64 with the years 'Y' option:

UFuncTypeError: Cannot cast ufunc 'subtract' input 1 from dtype('<m8[Y]') to dtype('<m8[ns]') with casting rule 'same_kind'

Secondly, in the operation the time unit changes from unix seconds since 1970-01-01 for the original input netCDF-file to hours since [start date], which in my case becomes 1959-06-21 00:00, for the created output netCDF-file. I want the time unit to still be in seconds since 1970-01-01.

Does anyone have any suggestions or input on how I can solve this issue?

Thanks everyone

Upvotes: 0

Views: 300

Answers (1)

jhamman

Reputation: 6464

For the main part of your question, you should convert the time coordinate to an index before performing the arithmetic and use pandas.Timedelta:

import pandas as pd
ds['time'] = ds.time.to_index() - pd.Timedelta(days=3652)

This should work for most datetime-like indexes in Xarray.
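A fixed number of days still leaves the leap-day offset described in the question. One possible alternative is a calendar-aware offset. The following is a minimal sketch, assuming pd.DateOffset is acceptable here; the `times` index is only a stand-in for ds.time.to_index(), built from the dates given in the question:

```python
import pandas as pd

# Stand-in for ds.time.to_index(): 3-hourly steps over the period in the question
times = pd.date_range("1969-06-20 00:00", "1971-07-01 00:00", freq="3h")

# DateOffset(years=10) shifts each timestamp back by 10 calendar years,
# so leap days between the old and new dates are accounted for
shifted = times - pd.DateOffset(years=10)

print(shifted[0], shifted[-1])
```

With the real dataset, the result could presumably be assigned back with ds['time'] = ds.time.to_index() - pd.DateOffset(years=10). Note that a 29 February timestamp is clipped to 28 February when the target year is not a leap year; the period in the question contains no such date.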

Finally, I suggest reading this section in the Xarray docs that describes how to control the encoding of time variables.
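As a rough illustration of what controlling the time encoding can look like, the sketch below passes an explicit unit for the time variable when writing. The small in-memory dataset and the variable name "var" are hypothetical placeholders, not taken from the question's file, and the write call is left commented out because it needs a netCDF backend:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical in-memory stand-in for the shifted dataset
time = pd.date_range("1959-06-20 00:00", periods=4, freq="3h")
ds = xr.Dataset({"var": ("time", np.zeros(4))}, coords={"time": time})

# Request that time be written as seconds since the Unix epoch,
# instead of letting xarray pick a unit and reference date
encoding = {"time": {"units": "seconds since 1970-01-01 00:00:00"}}

# Uncomment to write the file (requires a netCDF backend such as netCDF4 or scipy)
# ds.to_netcdf("new_file.nc", encoding=encoding)
```

Replacing the time coordinate appears to drop the encoding read from the original file, which would explain the changed unit in the output, so stating the unit explicitly at write time is one way to keep it.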

Upvotes: 0
