Reputation: 21961
Is there a way to create a netCDF file with time dimension beyond year 2263 using xarray?
Here is how a netCDF toy dataset can be created: http://xarray.pydata.org/en/stable/time-series.html
However, the time dimension is a pandas datetime index, and those do not extend beyond 2263, as can be seen here: https://github.com/pandas-dev/pandas/issues/13346
Upvotes: 5
Views: 1584
Reputation: 3775
In the future, you might be able to do this by creating a date axis using cftime objects, but as it currently stands there is an outstanding issue in xarray that won't let you write netCDF files containing such objects.
However, even if you could save such objects, the easiest and cleanest way to do this is still to define that axis manually as an array of integers with some units.
import numpy as np
import xarray as xr
days = np.asarray(range(100*365))
ds = xr.Dataset(
{'time': (['time'], days, {'units': 'days since 2200-01-01 0:0:0'})}
)
print(ds['time'][-1])
ds.to_netcdf('test.nc')
ds = xr.open_dataset('test.nc')
print(ds['time'][-1])
gives the output
<xarray.DataArray 'time' ()>
array(36499)
Coordinates:
time int64 36499
Attributes:
units: days since 2200-01-01 0:0:0
followed by
<xarray.DataArray 'time' ()>
array(datetime.datetime(2299, 12, 7, 0, 0), dtype=object)
Coordinates:
time object 2299-12-07
Notice that when you re-open the dataset, xarray will automatically decode it.
The 'units' attribute you use should follow the CF conventions for time coordinates. You can replace 'days' with 'hours', 'minutes', or 'seconds' as you need.
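To make the decoding concrete, here is a minimal sketch of how an integer offset plus a 'days since ...' reference maps back to a date. It uses only Python's standard library, so it assumes the standard Gregorian calendar and is not xarray's actual decoder. The variable names are illustrative.

```python
from datetime import datetime, timedelta

# Reference date taken from the 'units' attribute above.
reference = datetime(2200, 1, 1)

# Last value of the integer axis in the example (100*365 - 1 days).
last_offset_days = 100 * 365 - 1

# Decoding is just "reference + offset" in the stated unit.
decoded = reference + timedelta(days=last_offset_days)
print(decoded)

# The same instant for units of 'hours since 2200-01-01 0:0:0'.
last_offset_hours = last_offset_days * 24
assert reference + timedelta(hours=last_offset_hours) == decoded
```

This prints 2299-12-07 00:00:00, matching the decoded value xarray shows after re-opening the file.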
This does require you to compute the needed integers manually, which is mainly difficult if your time axis is in years (since "year" is not a well-defined unit of time: its length depends on leap years). If that's the case, you can use something like the following:
import cftime
# replace this to use a different calendar
Datetime = cftime.DatetimeProlepticGregorian
# make your list of Datetime objects
time_list = []
month = day = 1
hour = minute = second = 0
for year in range(2200, 2300, 1):
    time_list.append(Datetime(year, month, day, hour, minute, second))
# this will convert them into a time axis, here in units of
# 'days since 2200-01-01 0:0:0'
seconds_in_day = 60*60*24
day_list = []
for dt in time_list:
    time_since_2200 = dt - Datetime(2200, month, day, hour, minute, second)
    day_list.append(int(time_since_2200.total_seconds() / seconds_in_day))
You can use a different cftime class (such as cftime.DatetimeJulian or cftime.DatetimeNoLeap) to use a different calendar. This code should be modified to give the right time_list for your use. You can also switch out seconds_in_day for seconds in some other time unit (and also supply that unit to the xr.Dataset call).
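If the standard Gregorian calendar is all you need, the same yearly offsets can probably be computed without cftime, since Python's built-in datetime.date supports years up to 9999. This is an alternative sketch, not what the code above does, and it will not work for calendars such as no-leap or Julian.

```python
from datetime import date

# Reference date matching 'days since 2200-01-01 0:0:0'.
reference = date(2200, 1, 1)

# One date per year, on 1 January, as in the loop above.
dates = [date(year, 1, 1) for year in range(2200, 2300)]

# Subtracting two dates gives a timedelta; .days is the integer offset.
day_list = [(d - reference).days for d in dates]

print(day_list[:3], day_list[-1])
```

The resulting day_list can be passed to xr.Dataset in place of days, with the same 'units' attribute.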
Upvotes: 2
Reputation: 3856
The problem might be that xarray optionally uses netcdftime for times outside the datetime.datetime range, but pandas does not. So something like this example won't work, even with netcdftime installed:
import numpy as np
import pandas as pd
import xarray as xr
data = np.random.rand(4, 3)
locs = ['IA', 'IL', 'IN']
times = pd.date_range('2318-04-25', periods=4)
da = xr.DataArray(data, coords=[times, locs], dims=['time', 'space'])
This will fail when you try to create the pandas date_range. Even providing a netcdftime.datetime as the first argument of pd.date_range() doesn't work because pandas wants to convert to its own, limited datetime type.
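The limit comes from pandas' default timestamp representation, which is a signed 64-bit count of nanoseconds since 1970-01-01. A rough standard-library sketch of where that range runs out (it does not call pandas itself, and the names are illustrative):

```python
from datetime import datetime, timedelta

# Largest value a signed 64-bit integer can hold, read as nanoseconds.
max_ns = 2**63 - 1

# datetime only has microsecond resolution, so drop the last three digits.
epoch = datetime(1970, 1, 1)
limit = epoch + timedelta(microseconds=max_ns // 1000)

print(limit)
```

The result falls in April 2262, which is why a date_range starting in 2318 cannot be represented.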
Instead, you need to specify times directly to xarray. Unfortunately, this is where my knowledge of netcdf fails me, but I can give you the outlines and maybe you can get it from here.
There are many ways to specify dates in DataArray parameters. You need to create your own date range with the netcdftime.datetime type as its base. You can create a date index with netcdftime.date2index() and use that instead of the pandas DatetimeIndex in the example above.
You should probably post your example code that shows the problem. I've assumed you're trying to create a DataArray, but maybe this is not the issue you're having.
Upvotes: 0