Reputation: 12439
I have several netCDF files, created by Copernicus, which can be downloaded here. There are four files, each about 1 GB.
I read one of the files like so:
import xarray as xr
dset = xr.open_dataset("~/.../ERA5land1.nc")
This gives me:
<xarray.Dataset>
Dimensions: (latitude: 61, longitude: 101, time: 87647)
Coordinates:
* latitude (latitude) float32 31.0 30.9 30.8 30.7 ... 25.3 25.2 25.1 25.0
* longitude (longitude) float32 79.0 79.1 79.2 79.3 ... 88.7 88.8 88.9 89.0
* time (time) datetime64[ns] 1981-01-01T01:00:00 ... 1990-12-31T23:00:00
Data variables:
t2m (time, latitude, longitude) float32 dask.array<shape=(87647, 61, 101), chunksize=(10, 61, 101)>
Attributes:
Conventions: CF-1.6
history: 2020-03-10 16:47:13 GMT by grib_to_netcdf-2.16.0: /opt/ecmw...
Calculating the mean should be straightforward, according to the documentation:
mean = dset.mean()
That causes the computer to freeze and eventually crash. Trying to chunk the data does not work either:
dset = xr.open_dataset("~/.../ERA5land1.nc", chunks = {'time': 10})
mean = dset.mean()
That does not crash, but I get this
<xarray.Dataset>
Dimensions: ()
Data variables:
t2m float32 dask.array<shape=(), chunksize=()>
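I suspect mean() with no arguments averages over every dimension, which would explain the dimensionless result above. A sketch of what I think I need, though I am not sure it is the right approach or whether it will stay within memory:
import xarray as xr

# open lazily with dask, chunked along the time dimension
dset = xr.open_dataset("~/.../ERA5land1.nc", chunks={"time": 10})

# reduce over time only, keeping the latitude/longitude grid
mean = dset.mean(dim="time")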
I wonder how I can calculate the min, max and mean for each grid cell and store them in a new netCDF file with the same specs.
Upvotes: 1
Views: 1040
Reputation: 3407
This can be solved using my package nctoolkit (available through pip: https://pypi.org/project/nctoolkit/; user guide: https://nctoolkit.readthedocs.io/en/latest/installing.html).
This uses CDO as a backend, so it should be able to handle your data easily. The code required would be very similar to what you have provided:
import nctoolkit as nc
dset = nc.open_data("~/.../ERA5land1.nc")
dset.tmean()  # temporal mean for each grid cell; nctoolkit methods modify the dataset in place
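For the min and max for each grid cell, the temporal statistics tmin and tmax work the same way. A minimal sketch, opening the file once per statistic since the methods operate in place:
# per-grid-cell minimum over time
dmin = nc.open_data("~/.../ERA5land1.nc")
dmin.tmin()

# per-grid-cell maximum over time
dmax = nc.open_data("~/.../ERA5land1.nc")
dmax.tmax()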
If you then want an xarray Dataset, you would do this:
dset.to_xarray()
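To store the result in a new netCDF file, recent versions of nctoolkit provide to_nc; a sketch, where the output file name is just a placeholder:
# write the temporal mean to a new netCDF file
dset.to_nc("ERA5land1_tmean.nc")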
Upvotes: 1