Reputation: 89
I have one NetCDF file for the month of September 2007. It contains 6-hourly data on a lat/lon grid with wind and humidity variables. Each variable has shape (120, 45, 93): 120 time steps (4 per day), 45 latitudes and 93 longitudes. With the following code, I am able to get daily averaged data for all variables, so each variable now has shape (30, 45, 93). Time is an integer with units of 'hours since 1900-01-01 00:00:00.0'.
From this daily averaged data, how can I split it into 30 separate NetCDF files, one per day, with each file name containing the date in YYYY:MM:DD format?
import xarray as xr
monthly_data = xr.open_dataset('interim_2007-09-01to2007-09-31.nc')
daily_data = monthly_data.resample(time='1D').mean()
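For intuition about what the resample step computes, here is a minimal NumPy sketch. It assumes exactly four time steps per day in chronological order, as described above; the random array and the names `six_hourly` and `daily` are illustrative stand-ins, not the real data.

```python
import numpy as np

# Synthetic stand-in for one variable: 120 six-hourly steps, 45 lats, 93 lons.
rng = np.random.default_rng(0)
six_hourly = rng.random((120, 45, 93))

# Split the time axis into (day, step within day) and average over the steps.
steps_per_day = 4
n_days = six_hourly.shape[0] // steps_per_day
daily = six_hourly.reshape(n_days, steps_per_day, 45, 93).mean(axis=1)

print(daily.shape)
```

The first daily slice is simply the mean of the first four 6-hourly slices, which is a quick way to sanity-check the output of the resample call.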
Upvotes: 3
Views: 3170
Reputation: 8087
Just in case it helps anyone, it is also possible to perform this task of calculating the daily mean and dividing into separate daily files directly from the command line:
cdo splitday -daymean in.nc day
which produces a series of files day01.nc day02.nc ...
Upvotes: 1
Reputation: 6434
Xarray has a top-level function for situations like this: xarray.save_mfdataset. In your case, you would use groupby to break your dataset into logical chunks and then create a list of corresponding file names. From there, just let save_mfdataset do the rest.
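import pandas as pd
import xarray as xr
# ds is the monthly dataset from xr.open_dataset (monthly_data in the question)
dates, datasets = zip(*ds.resample(time='1D').mean('time').groupby('time'))
filenames = [pd.to_datetime(date).strftime('%Y.%m.%d') + '.nc' for date in dates]
xr.save_mfdataset(datasets, filenames)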
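As a self-contained variation on the same idea, the sketch below runs on synthetic data so it can be tried without the original file. The helper names `plan_daily_files` and `write_daily_files`, the variable name "u", and the "daily" output directory are all hypothetical; it selects each day with `isel` instead of `groupby` so that every piece keeps a length-1 time dimension.

```python
import os

import numpy as np
import pandas as pd
import xarray as xr


def plan_daily_files(ds, out_dir="."):
    """Return (datasets, paths): one daily-mean dataset and one path per day."""
    daily = ds.resample(time="1D").mean()
    # Keep time as a length-1 dimension so each file still carries its timestamp.
    datasets = [daily.isel(time=[i]) for i in range(daily.sizes["time"])]
    stamps = pd.to_datetime(daily["time"].values).strftime("%Y-%m-%d")
    paths = [os.path.join(out_dir, stamp + ".nc") for stamp in stamps]
    return datasets, paths


def write_daily_files(ds, out_dir="."):
    """Write all daily files with a single save_mfdataset call."""
    datasets, paths = plan_daily_files(ds, out_dir)
    os.makedirs(out_dir, exist_ok=True)
    xr.save_mfdataset(datasets, paths)
    return paths


# Synthetic stand-in for the September 2007 file: 120 six-hourly steps.
time = pd.date_range("2007-09-01", periods=120, freq=pd.Timedelta(hours=6))
ds = xr.Dataset(
    {"u": (("time", "latitude", "longitude"), np.zeros((120, 45, 93)))},
    coords={"time": time},
)

datasets, paths = plan_daily_files(ds, out_dir="daily")
print(len(paths), paths[0], paths[-1])
```

Calling `write_daily_files(ds, "daily")` would then write the files, provided a NetCDF backend such as netCDF4 or scipy is installed.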
Upvotes: 6
Reputation: 1271
After going through the documentation, I found that you can use netCDF4's num2date to convert an integer to a date. You can also index an xarray.Dataset using isel():
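from netCDF4 import num2date
# Assumes day.time holds the raw integer hours (time not decoded to datetime64)
for i in range(30):
    day = daily_data.isel(time=i)
    the_date = num2date(day.time.data, units='hours since 1900-01-01 00:00:00')
    # File name is the calendar date, e.g. 2007-09-01.nc
    day.to_netcdf(str(the_date.date()) + '.nc', format='NETCDF4')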
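To make the integer-to-date conversion concrete, here is a minimal standard-library sketch of the same arithmetic for this particular units string. It assumes the standard Gregorian calendar, and the helper names `hours_to_datetime` and `daily_filename` are made up for illustration; it is not a replacement for num2date, which handles arbitrary units and calendars.

```python
from datetime import datetime, timedelta

# Reference time taken from the units 'hours since 1900-01-01 00:00:00.0'.
EPOCH = datetime(1900, 1, 1)


def hours_to_datetime(hours):
    """Convert an integer hour offset from EPOCH to a datetime."""
    return EPOCH + timedelta(hours=int(hours))


def daily_filename(hours, sep="-"):
    """Build a date-stamped file name such as YYYY-MM-DD.nc."""
    return hours_to_datetime(hours).strftime(f"%Y{sep}%m{sep}%d") + ".nc"


# Offset of 2007-09-01 00:00, computed here rather than read from a file.
first_step = int((datetime(2007, 9, 1) - EPOCH).total_seconds() // 3600)
print(daily_filename(first_step))
```

Passing `sep=":"` gives the YYYY:MM:DD form asked for in the question, but colons are not allowed in Windows file names, which is why the sketch defaults to hyphens.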
Upvotes: 2