Reputation: 55
I have a large (>GB) file of daily global ocean surface temperatures. I've never really worked with netCDF files before, mainly much smaller pandas dataframes and CSVs. With some fiddling I was able to make a few world maps from netCDF data, but that was with monthly data, and the daily values make the file much larger. I'm using numpy and matplotlib right now, with Python on Windows. I tried xarray, but it was unable to allocate enough memory... Are there any recommendations for software that can manipulate netCDF files? Or is there a way to 'ignore' the values that I don't need? I came across masking, but I'm not sure whether that would help. Slicing?
For example, from this netCDF file I would like to access only the data around the Hawaiian Islands, and only for specific time frames.
This is for oceanographic/climatological purposes.
Upvotes: 0
Views: 1176
Reputation: 265
If you only want smaller sections of the data set, I would recommend CDO. With it you can extract single regions, time slices and variables from your somewhat too large file.
For example, if you only want to have the variable tsurf (surface temperature) over Europe, you can use
cdo -selvar,tsurf -sellonlatbox,-44.5,64.5,22,72.5 infile.nc outfile.nc
on the command line to extract just that variable and region.
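If you would rather drive CDO from Python, here is a minimal sketch that builds the same kind of chained call for a box around Hawaii plus a date range. The bounding box, the dates, the output name `hawaii.nc` and the helper names are my assumptions for illustration, and `tsurf` is just the variable name from the example above, so check your own file's variable and longitude convention first. It uses the CDO operators `selname`, `sellonlatbox` and `seldate`.

```python
import shlex
import subprocess


def build_cdo_command(infile, outfile, var, lon_min, lon_max, lat_min, lat_max, start, end):
    """Return the argument list for one chained CDO call (hypothetical helper)."""
    return [
        "cdo",
        f"-seldate,{start},{end}",                                 # keep only this date range
        f"-sellonlatbox,{lon_min},{lon_max},{lat_min},{lat_max}",  # crop to the lon/lat box
        f"-selname,{var}",                                         # keep only this variable
        infile,
        outfile,
    ]


def run_cdo(cmd):
    """Run the command; needs CDO on the PATH and an existing input file."""
    subprocess.run(cmd, check=True)


# Assumed values: a rough box around the Hawaiian Islands and one year of data.
cmd = build_cdo_command(
    "infile.nc", "hawaii.nc", "tsurf", -161, -154, 18, 23, "2010-01-01", "2010-12-31"
)
print(shlex.join(cmd))  # inspect the command before running it with run_cdo(cmd)
```

Passing the arguments as a list avoids shell quoting problems, and printing the command first lets you paste it into a terminal to check it by hand.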
With xarray you can then (for example in Jupyter) select a specific time range.
import os
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import xarray as xr
start_date = "1990-12-31"
end_date = "2020-12-31"
# PATH_TO_UR_FILE is the directory that contains your netCDF file
yourXRdataset = xr.open_dataset(os.path.join(PATH_TO_UR_FILE, "yourfile.nc"))
customTimescale = yourXRdataset.sel(time=slice(start_date, end_date))
If you now want to plot the temperature averaged over the selected period, this is easily done with
plt.figure(figsize=(20,8), dpi=216)
ax = plt.subplot(projection=ccrs.PlateCarree())
customTimescale['tsurf'].mean('time').plot.contourf(ax=ax, cmap="Spectral_r", levels=33)
Cartopy is very good for displaying geographic data in Python.
Upvotes: 0
Reputation: 3397
You seem to be asking a lot of questions here.
First, you can use xarray to slice geographic data. Just read this guide and do some Google searches and you should find a solution. Without knowing the netCDF grid it is not possible to provide a specific answer. Memory should really not be an issue, as xarray loads data lazily: opening a file reads only the metadata, and the values are read when you actually use them. You can also use dask to work with multi-file datasets in xarray.
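To give a rough idea of what the slicing could look like, here is a sketch on a small synthetic dataset, since I don't know your grid. The coordinate names `lon`, `lat`, `time`, the variable name `sst`, the 1-degree grid, the Hawaii box and the helper `subset_region` are all assumptions; replace them with whatever `print(ds)` shows for your file.

```python
import numpy as np
import pandas as pd
import xarray as xr


def subset_region(ds, lon_min, lon_max, lat_min, lat_max, start, end):
    """Crop ds to a lon/lat box and a time range with label-based selection."""
    # If the grid runs 0..360, convert western (negative) longitudes.
    # Boxes that straddle the 0/360 seam are not handled in this sketch.
    if float(ds["lon"].max()) > 180:
        lon_min, lon_max = lon_min % 360, lon_max % 360
    # slice() bounds must follow the coordinate order (some grids store lat north to south).
    if float(ds["lat"][0]) > float(ds["lat"][-1]):
        lat_slice = slice(lat_max, lat_min)
    else:
        lat_slice = slice(lat_min, lat_max)
    return ds.sel(lon=slice(lon_min, lon_max), lat=lat_slice, time=slice(start, end))


# Synthetic stand-in for the real file: 10 days on an assumed 1-degree, 0..360 grid.
times = pd.date_range("2020-01-01", periods=10, freq="D")
lats = np.arange(-90.0, 91.0, 1.0)
lons = np.arange(0.0, 360.0, 1.0)
demo = xr.Dataset(
    {"sst": (("time", "lat", "lon"),
             np.zeros((times.size, lats.size, lons.size), dtype="float32"))},
    coords={"time": times, "lat": lats, "lon": lons},
)

# Assumed rough box around the Hawaiian Islands, three days of data.
hawaii = subset_region(demo, -161, -154, 18, 23, "2020-01-03", "2020-01-05")
print(dict(hawaii.sizes))

# With the real file you would open it first, e.g. ds = xr.open_dataset("yourfile.nc"),
# and pass ds instead of demo; only the selected subset needs to be read from disk.
```

Selecting before you compute or plot anything is the important part: calling `.values` or `.load()` on the full dataset is what tends to trigger the allocation error.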
In terms of alternatives, you can use NCO. Geographic cropping would look something like this.
ncks -d lat,0.,90. infile.nc outfile.nc
If you can access Linux, you could also do this using CDO or my package nctoolkit in Python (which uses CDO as a backend). For nctoolkit, the commands would be something like this:
import nctoolkit as nc
ds = nc.open_data("infile.nc")
ds.crop(lon = [0,90], lat = [0,90])
ds.to_nc("outfile.nc")
Upvotes: 1