Reputation: 57
I am trying to merge 100+ NDBC buoy NetCDF datasets, each with its own latitude and longitude, into one NetCDF dataset. When I use cdo or ncrcat, I get a combined dataset, but it only takes the latitude and longitude coordinates from the first station's NetCDF file. Also, I am not sure whether this is possible, but the station name (five digits) is stored in the attributes of each station file and is lost upon combining; I would like to carry each individual station name over into the combined file somehow.
Ideally, this is what I am wanting:
Here is one of the buoy NetCDF station datasets to see how it is structured: https://dods.ndbc.noaa.gov/thredds/fileServer/data/stdmet/41004/41004h9999.nc
Or, reading it through xarray produces:
<xarray.Dataset>
Dimensions: (latitude: 1, longitude: 1, time: 48)
Coordinates:
* time (time) datetime64[ns] 2021-04-01T00:50:00 ... 20...
* latitude (latitude) float32 31.4
* longitude (longitude) float32 -80.87
Data variables: (12/13)
wind_dir (time, latitude, longitude) float64 ...
wind_spd (time, latitude, longitude) float32 ...
gust (time, latitude, longitude) float32 ...
wave_height (time, latitude, longitude) float32 ...
dominant_wpd (time, latitude, longitude) timedelta64[ns] ...
average_wpd (time, latitude, longitude) timedelta64[ns] ...
... ...
air_pressure (time, latitude, longitude) float32 ...
air_temperature (time, latitude, longitude) float32 ...
sea_surface_temperature (time, latitude, longitude) float32 ...
dewpt_temperature (time, latitude, longitude) float32 ...
visibility (time, latitude, longitude) float32 ...
water_level (time, latitude, longitude) float32 ...
Attributes:
institution: NOAA National Data Buoy Center and Participators in Data As...
url: http://dods.ndbc.noaa.gov
quality: Automated QC checks with manual editing and comprehensive m...
conventions: COARDS
station: 41008
comment: GRAYS REEF - 40 NM Southeast of Savannah, GA
location: 31.400 N 80.866 W
I have tried converting to a pandas DataFrame and writing to HDF5, but the resulting file is not easy to work with once it is created. I also have much less experience with HDF5 files than with xarray and NetCDF (I was reusing a premade script, which is why the output was HDF5).
I've tried xarray.open_mfdataset(), which works but resulted in a 4 GB+ file when it should be around 100 MB, and it still did not keep the station name attribute. I would prefer to do this in Python (I am currently having issues using cdo and nco from Python), but I can also run these commands from bash without issues.
If any more info is needed, please let me know.
Upvotes: 0
Views: 455
Reputation: 6352
I suggest you try ncecat with group aggregation (gag), e.g.,
ncecat -7 --gag in*.nc out.nc
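Since the question mentions a preference for Python, the same command can be driven from a script through the standard library rather than through an NCO binding. This is only a minimal sketch, assuming the ncecat executable is on the PATH; the function names build_gag_command and merge_into_groups are hypothetical, and -O (overwrite an existing output file) is the flag used in the example session further down. Other flags, such as the -7 above, can be passed through extra_flags.

```python
import glob
import subprocess


def build_gag_command(inputs, output, extra_flags=(), overwrite=True):
    """Return the ncecat argument list for group aggregation (--gag)."""
    if not inputs:
        raise ValueError("no input files given")
    cmd = ["ncecat"]
    if overwrite:
        cmd.append("-O")      # overwrite an existing output file
    cmd.extend(extra_flags)   # e.g. ["-7"], passed through unchanged
    cmd.append("--gag")       # place each input file in its own group
    cmd.extend(inputs)
    cmd.append(output)
    return cmd


def merge_into_groups(pattern, output, extra_flags=()):
    """Run ncecat --gag on every file matching pattern (assumes NCO is installed)."""
    inputs = sorted(glob.glob(pattern))
    cmd = build_gag_command(inputs, output, extra_flags=extra_flags)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if ncecat fails
    return cmd
```

A call such as merge_into_groups("in*.nc", "out.nc") would then mirror the shell command, provided the output name does not itself match the input pattern.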
Followup to comment below:
As the referenced documentation says, this command places each input file in its entirety into its own group in the output file. You might think it "removed all of my data variables and values" if you did not examine the contents of the groups in the output, and just focused on the root level group (which contains only global metadata and subgroups). Use, e.g.,
ncks -m out.nc | more
to examine the subgroups:
zender@sastrugi:~/nco/data$ ncecat -O --gag 85.nc 86.nc 87.nc ~/foo.nc
zender@sastrugi:~/nco/data$ ncks -m -v lat ~/foo.nc | more
netcdf foo {
group: \85 {
dimensions:
lat = 2 ;
vrt_nbr = 2 ;
variables:
float lat(lat) ;
lat:long_name = "Latitude (typically midpoints)" ;
lat:units = "degrees_north" ;
lat:bounds = "lat_bnd" ;
float lat_bnd(lat,vrt_nbr) ;
lat_bnd:purpose = "Cell boundaries for lat coordinate" ;
} // group /85
group: \86 {
dimensions:
lat = 2 ;
vrt_nbr = 2 ;
variables:
float lat(lat) ;
lat:long_name = "Latitude (typically midpoints)" ;
lat:units = "degrees_north" ;
lat:bounds = "lat_bnd" ;
float lat_bnd(lat,vrt_nbr) ;
lat_bnd:purpose = "Cell boundaries for lat coordinate" ;
} // group /86
group: \87 {
dimensions:
lat = 2 ;
vrt_nbr = 2 ;
variables:
float lat(lat) ;
lat:long_name = "Latitude (typically midpoints)" ;
lat:units = "degrees_north" ;
lat:bounds = "lat_bnd" ;
float lat_bnd(lat,vrt_nbr) ;
lat_bnd:purpose = "Cell boundaries for lat coordinate" ;
} // group /87
} // group /
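The grouped file can also be inspected from Python. The following is a rough sketch, assuming the third-party netCDF4 and xarray packages are installed; the helper names list_groups, open_groups, and station_ids are hypothetical. It also assumes that each station's "station" attribute ends up on its group, which is worth confirming with ncks -m before relying on it.

```python
def list_groups(path):
    """Return the names of the top-level groups in a netCDF-4 file."""
    import netCDF4  # third-party, imported lazily

    with netCDF4.Dataset(path) as root:
        return list(root.groups)


def open_groups(path):
    """Open every top-level group as its own xarray.Dataset, keyed by group name."""
    import xarray as xr  # third-party, imported lazily

    return {name: xr.open_dataset(path, group=name) for name in list_groups(path)}


def station_ids(datasets):
    """Map group name -> 'station' attribute (None if the group has no such attribute)."""
    return {name: ds.attrs.get("station") for name, ds in datasets.items()}
```

For example, station_ids(open_groups("out.nc")) would give one entry per input file, and each dataset in the dictionary keeps its own latitude and longitude coordinates.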
Upvotes: 2