Reputation: 1800
I noticed that while concatenating multiple yearly NetCDF files into one file, or while splitting a time-series file into yearly groups, xarray's .to_netcdf() automatically updates the time units. Here is an example of what I mean:
# time attribute of the file
ncdump -h file_1970_2017.nc
>>double time(time) ;
time:_FillValue = NaN ;
time:units = "Hours since 1900-01-01T00:00:00+00:00" ;
time:calendar = "proleptic_gregorian" ;
# after splitting the file into yearly files using the group-by method, the time attribute is automatically modified
# example
ncdump -h file_splitted_2005.nc
>>double time(time) ;
time:_FillValue = NaN ;
time:units = "Hours since 2005-01-01T00:00:00+00:00" ;
time:calendar = "proleptic_gregorian" ;
The same problem occurs when I do the reverse, i.e. when I concatenate individual yearly files into a single file. Is there some way I can force it not to change the time attribute? From the documentation, it seems the 'encoding' argument might help, but I couldn't figure out how.
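For reference, this is roughly how I do the splitting (a minimal sketch; the real script differs only in details):
import xarray as xr

# open the long time series and write one file per year
ds = xr.open_dataset('file_1970_2017.nc')
for year, ds_year in ds.groupby('time.year'):
    # without an explicit encoding, xarray picks its own time reference
    # date for each output file, which is what changes the units
    ds_year.to_netcdf(f'file_splitted_{year}.nc')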
Upvotes: 1
Views: 912
Reputation: 1800
Figured it out. This can be achieved by passing the encoding argument as a nested dictionary:
# when writing out the dataset ds encoding can be used as
ds.to_netcdf('file_splitted_2005.nc',
             encoding={'my_variable': {'_FillValue': -999.0},
                       'time': {'units': "seconds since 1900-01-01 00:00:00"}})
If I understood correctly, when writing out the data, xarray automatically re-calculates the time values based on whatever units attribute we specify, as long as the time array holds datetime objects. In that case it uses its datetime handling under the hood. That means I could also specify
'time':{'units': "seconds since 2000-01-01 00:00:00"}
and it will automatically re-calculate the values it stores in the time array, making life much easier for us.
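The same trick works in the reverse direction from the question, i.e. when concatenating the yearly files back together. A minimal sketch, assuming the yearly files follow the 'file_splitted_*.nc' naming above:
import glob
import xarray as xr

# collect and combine the yearly files along the time dimension
files = sorted(glob.glob('file_splitted_*.nc'))
datasets = [xr.open_dataset(f) for f in files]
combined = xr.concat(datasets, dim='time')

# forcing a fixed reference date keeps the units identical across files;
# xarray converts the datetime64 time values to match
combined.to_netcdf(
    'file_concatenated.nc',
    encoding={'time': {'units': 'seconds since 1900-01-01 00:00:00'}},
)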
Upvotes: 4