Jake

Reputation: 57

Using CDO or NCO to combine buoy station datasets (NetCDF), lat and lon get dropped

I am trying to merge 100+ NDBC buoy NetCDF datasets, each with its own latitude and longitude, into one NetCDF dataset. When I use cdo or ncrcat I get a combined dataset, but it keeps only the latitude and longitude coordinates from the first station file. Also, the five-digit station name is stored in the attributes of each station file, and it is lost on combining as well. I would like to carry each station name through to the combined file, if that is possible.

Ideally, this is what I am wanting:

Here is one of the buoy NetCDF station datasets to see how it is structured: https://dods.ndbc.noaa.gov/thredds/fileServer/data/stdmet/41004/41004h9999.nc

Or, reading it through xarray produces:

<xarray.Dataset>
Dimensions:                  (latitude: 1, longitude: 1, time: 48)
Coordinates:
  * time                     (time) datetime64[ns] 2021-04-01T00:50:00 ... 20...
  * latitude                 (latitude) float32 31.4
  * longitude                (longitude) float32 -80.87
Data variables: (12/13)
    wind_dir                 (time, latitude, longitude) float64 ...
    wind_spd                 (time, latitude, longitude) float32 ...
    gust                     (time, latitude, longitude) float32 ...
    wave_height              (time, latitude, longitude) float32 ...
    dominant_wpd             (time, latitude, longitude) timedelta64[ns] ...
    average_wpd              (time, latitude, longitude) timedelta64[ns] ...
    ...                       ...
    air_pressure             (time, latitude, longitude) float32 ...
    air_temperature          (time, latitude, longitude) float32 ...
    sea_surface_temperature  (time, latitude, longitude) float32 ...
    dewpt_temperature        (time, latitude, longitude) float32 ...
    visibility               (time, latitude, longitude) float32 ...
    water_level              (time, latitude, longitude) float32 ...
Attributes:
    institution:  NOAA National Data Buoy Center and Participators in Data As...
    url:          http://dods.ndbc.noaa.gov
    quality:      Automated QC checks with manual editing and comprehensive m...
    conventions:  COARDS
    station:      41008
    comment:      GRAYS REEF - 40 NM Southeast of Savannah, GA
    location:     31.400 N 80.866 W 

I have tried converting to a pandas DataFrame and writing to HDF5, but the resulting file is not easy to work with. I also have much less experience with HDF5 than with xarray and NetCDF (I was reusing a premade script, which is why the output was HDF5).

I've tried xarray.open_mfdataset(), which works but produced a 4 GB+ file when it should be around 100 MB, and it still did not keep the station name attribute. I would prefer to do this in Python (I am currently having issues using cdo and nco from Python), but I can also run these commands from bash without issues.

If any more info is needed, please let me know.

Upvotes: 0

Views: 455

Answers (1)

Charlie Zender

Reputation: 6352

I suggest you try ncecat with group aggregation (gag), e.g.,

ncecat -7 --gag in*.nc out.nc
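
Since the question asks for something that can be driven from Python, here is a minimal sketch that shells out to ncecat with the standard library rather than relying on a Python binding. The helper names (`build_ncecat_command`, `run_group_aggregation`) are hypothetical; the `-O` (overwrite) and `--gag` flags are the ones used in the example further down, and it assumes the `ncecat` executable is on the PATH.

```python
import glob
import shutil
import subprocess


def build_ncecat_command(inputs, output):
    """Return the argument list for `ncecat -O --gag <inputs...> <output>`."""
    # -O overwrites an existing output file; --gag puts each input in its own group.
    return ["ncecat", "-O", "--gag", *inputs, output]


def run_group_aggregation(pattern, output):
    """Aggregate every file matching `pattern` into groups of `output`."""
    inputs = sorted(glob.glob(pattern))
    if not inputs:
        raise FileNotFoundError(f"no input files match {pattern!r}")
    if shutil.which("ncecat") is None:
        raise RuntimeError("ncecat (NCO) was not found on PATH")
    # check=True raises CalledProcessError if ncecat exits with an error.
    subprocess.run(build_ncecat_command(inputs, output), check=True)
```

For example, `run_group_aggregation("in*.nc", "out.nc")` would run the same kind of command as above on every matching file in the current directory.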

Follow-up to the comment below:

As the referenced documentation says, this command places each input file in its entirety into its own group in the output file. You might think it "removed all of my data variables and values" if you did not examine the contents of the groups in the output, and just focused on the root level group (which contains only global metadata and subgroups). Use, e.g.,

ncks -m out.nc | more

to examine the subgroups:

zender@sastrugi:~/nco/data$ ncecat -O --gag 85.nc 86.nc 87.nc ~/foo.nc
zender@sastrugi:~/nco/data$ ncks -m -v lat ~/foo.nc | more
netcdf foo {
  group: \85 {
    dimensions:
      lat = 2 ;
      vrt_nbr = 2 ;

    variables:
      float lat(lat) ;
        lat:long_name = "Latitude (typically midpoints)" ;
        lat:units = "degrees_north" ;
        lat:bounds = "lat_bnd" ;

      float lat_bnd(lat,vrt_nbr) ;
        lat_bnd:purpose = "Cell boundaries for lat coordinate" ;
  } // group /85
  group: \86 {
    dimensions:
      lat = 2 ;
      vrt_nbr = 2 ;

    variables:
      float lat(lat) ;
        lat:long_name = "Latitude (typically midpoints)" ;
        lat:units = "degrees_north" ;
        lat:bounds = "lat_bnd" ;

      float lat_bnd(lat,vrt_nbr) ;
        lat_bnd:purpose = "Cell boundaries for lat coordinate" ;
  } // group /86
  group: \87 {
    dimensions:
      lat = 2 ;
      vrt_nbr = 2 ;

    variables:
      float lat(lat) ;
        lat:long_name = "Latitude (typically midpoints)" ;
        lat:units = "degrees_north" ;
        lat:bounds = "lat_bnd" ;

      float lat_bnd(lat,vrt_nbr) ;
        lat_bnd:purpose = "Cell boundaries for lat coordinate" ;
  } // group /87
} // group /
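
If a single flat dataset is preferred over groups, one alternative is to do the combining in xarray along a new station dimension. This is only a sketch under assumptions, not part of the NCO approach above: the helper names are hypothetical, the second station ("99999") and all data values are synthetic, and the variable layout copies the xarray printout in the question. The idea is to replace the length-1 latitude and longitude dimensions with one station dimension, so that latitude and longitude become per-station coordinates instead of grid axes. Combining on latitude and longitude as dimensions is a likely cause of the oversized output, since every station would then get a cell for every other station's position.

```python
import numpy as np
import xarray as xr


def to_station_record(ds):
    """Swap the length-1 latitude/longitude dims for a single 'station' dim."""
    station_id = str(ds.attrs["station"])
    lat = float(ds["latitude"].values[0])
    lon = float(ds["longitude"].values[0])
    # Remove the singleton dims and their coordinates.
    ds = ds.squeeze(["latitude", "longitude"], drop=True)
    # Add a length-1 'station' dim labelled with the station id.
    ds = ds.expand_dims(station=[station_id])
    # Re-attach the position as coordinates indexed by station.
    return ds.assign_coords(latitude=("station", [lat]),
                            longitude=("station", [lon]))


def fake_station(station_id, lat, lon, start):
    """Build a tiny synthetic dataset shaped like one buoy file (demo only)."""
    time = np.datetime64(start, "ns") + np.arange(3) * np.timedelta64(1, "h")
    return xr.Dataset(
        {"wind_spd": (("time", "latitude", "longitude"),
                      np.ones((3, 1, 1), dtype="float32"))},
        coords={"time": time,
                "latitude": np.array([lat], dtype="float32"),
                "longitude": np.array([lon], dtype="float32")},
        attrs={"station": station_id},
    )


stations = [
    fake_station("41008", 31.4, -80.87, "2021-04-01T00:00"),
    fake_station("99999", 10.0, -50.0, "2021-04-01T01:00"),  # made-up station
]
# join="outer" keeps the union of time stamps; gaps are filled with NaN.
combined = xr.concat([to_station_record(ds) for ds in stations],
                     dim="station", join="outer")
```

With real files, each dataset would come from `xr.open_dataset(path)` instead of `fake_station`. Global attributes other than the station id are not merged per station here, so ones such as `comment` or `location` would need similar handling if they matter.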

Upvotes: 2
