Reputation: 1426
I'm not sure how to word this question but I hope this example can explain it.
I have a series of netCDF files, one per day of data. Each file has a time dimension that holds a 30-day forecast.
If I read in a year's worth of data using:
data=xarray.open_mfdataset(files, concat_dim='None', autoclose='True')
Then I get:
Dimensions: (None: 365, lat: 110, lon: 100, time: 395)
I'm only interested in the value at the first time step of each file, i.e. for file = 0 I want time = 0, for file = 360 I want time = 360, etc.
Basically, I think what I want is to read in only the first element of the time dimension from each file, but I can't figure out how to do that with open_mfdataset.
Even just dropping the unwanted values after reading the whole thing in would be fine, but I can't figure that out either because of the way open_mfdataset concatenates the datasets.
Upvotes: 1
Views: 5647
Reputation: 6464
Using a preprocess function will let you do what you're after. The preprocess function is applied to each dataset before concatenation, so you can use it to reformat the datasets during the open_mfdataset step.
import xarray as xr

def preprocess(ds):
    '''Keep only the first timestep of each file.'''
    return ds.isel(time=0)

data = xr.open_mfdataset(files, preprocess=preprocess, concat_dim='time', ...)
Depending on how your files are formatted, you may have to do further cleanup of the datasets in preprocess.
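To see what the preprocess step does without needing the actual files, here is a minimal sketch that mimics it in memory. The make_forecast helper, the variable name var, and the small lat/lon grid are made up for illustration; only isel and xr.concat are real xarray calls. It indexes with a list (time=[0]) rather than a scalar so that time stays a length-1 dimension, which is an assumption about what you want in the combined result.

```python
import numpy as np
import xarray as xr


def preprocess(ds):
    """Keep only the first timestep, retaining 'time' as a length-1 dimension."""
    # Indexing with a list ([0]) instead of a scalar (0) keeps the dimension,
    # so the per-file results can be concatenated along 'time'.
    return ds.isel(time=[0])


def make_forecast(start_day, n_lead=30):
    """Hypothetical stand-in for one daily file: a forecast starting at start_day."""
    time = np.arange(start_day, start_day + n_lead)
    values = np.full((n_lead, 2, 3), float(start_day))
    return xr.Dataset(
        {"var": (("time", "lat", "lon"), values)},
        coords={"time": time, "lat": [0.0, 1.0], "lon": [0.0, 1.0, 2.0]},
    )


# Three synthetic "files" whose forecasts start on days 0, 1 and 2.
datasets = [make_forecast(day) for day in range(3)]

# Roughly what open_mfdataset does: preprocess each dataset, then concatenate.
combined = xr.concat([preprocess(ds) for ds in datasets], dim="time")
print(combined)
```

With real files, the same preprocess function can be passed to open_mfdataset. In recent xarray releases, as far as I know, concat_dim also needs combine='nested' to be given explicitly, so check the documentation for the version you have installed.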
Upvotes: 5