clifgray

Reputation: 4419

open_mfdataset with xarray failing to find coordinates

I'm downloading a batch of GOES-16 radiance files and trying to open them all together with xr.open_mfdataset() so I can analyze them in xarray. Each netCDF file has a coordinate t holding the time stamp, which I want to use as the dimension to join along. When I try this I get the error ValueError: Could not find any dimension coordinates to use to order the datasets for concatenation. Here is my code, along with links to download two example .nc files.

Download two files with:

wget https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2019/141/02/OR_ABI-L1b-RadF-M6C14_G16_s20191410240370_e20191410250078_c20191410250143.nc
wget https://noaa-goes16.s3.amazonaws.com/ABI-L1b-RadF/2019/141/03/OR_ABI-L1b-RadF-M6C14_G16_s20191410310370_e20191410320078_c20191410320142.nc

And the code:

import xarray as xr
ds_sst = xr.open_mfdataset("OR_ABI-L1b-RadF*nc", concat_dim='t',combine='by_coords')

Is there anything I can do to make this work so I can open a couple dozen of these files together?

Upvotes: 3

Views: 7361

Answers (1)

hyperpiano

Reputation: 444

Use combine='nested' instead.

From the Xarray documentation on combining by coords:

Attempt to auto-magically combine the given datasets into one by using dimension coordinates.

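In these files 't' is a scalar coordinate, not a dimension coordinate: no variable has a 't' dimension. xarray's combine_by_coords works by looking for dimension coordinates whose values differ between the imported netCDF files, so here it finds nothing to order the datasets by and the magic doesn't work.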
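The failure can be reproduced without the GOES files. The following is a minimal sketch on small synthetic datasets; make_scene and the 3x4 image it builds are made up for illustration and only mimic the relevant structure (a (y, x) variable plus a scalar 't' coordinate).

```python
import numpy as np
import xarray as xr


def make_scene(timestamp):
    """Hypothetical stand-in for one file: a (y, x) image with a scalar 't' coordinate."""
    return xr.Dataset(
        data_vars={"Rad": (("y", "x"), np.zeros((3, 4)))},
        coords={
            "y": np.arange(3),
            "x": np.arange(4),
            "t": np.datetime64(timestamp),  # scalar coordinate: no 't' dimension exists
        },
    )


scenes = [make_scene("2019-05-21T02:45"), make_scene("2019-05-21T03:15")]

print("t" in scenes[0].coords)  # True
print("t" in scenes[0].dims)    # False

# x and y are identical in both datasets and 't' is not a dimension,
# so combine_by_coords has no dimension coordinate to order the datasets by.
try:
    xr.combine_by_coords(scenes)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```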

In this case you need to be more specific: use combine='nested' and name the dimension to concatenate along with concat_dim='t'. Because each file already has a coordinate named 't', xarray automatically promotes it to a dimension coordinate.

ds_sst = xr.open_mfdataset("OR_ABI-L1b-RadF*nc", concat_dim='t', combine='nested')

The resulting dataset looks like this.

<xarray.Dataset>
Dimensions:                                           (band: 1, num_star_looks: 24, number_of_image_bounds: 2, number_of_time_bounds: 2, t: 2, x: 5424, y: 5424)
Coordinates:
    band_wavelength_star_look                         (num_star_looks) float32 dask.array<chunksize=(24,), meta=np.ndarray>
    x_image                                           float32 0.0
    y_image                                           float32 0.0
    band_wavelength                                   (band) float32 dask.array<chunksize=(1,), meta=np.ndarray>
    band_id                                           (band) int8 dask.array<chunksize=(1,), meta=np.ndarray>
    t_star_look                                       (num_star_looks) datetime64[ns] dask.array<chunksize=(24,), meta=np.ndarray>
  * y                                                 (y) float32 0.151844 ... -0.151844
  * x                                                 (x) float32 -0.151844 ... 0.151844
  * t                                                 (t) datetime64[ns] 2019-05-21T02:45:22.400760064 2019-05-21T03:15:22.406056960
Dimensions without coordinates: band, num_star_looks, number_of_image_bounds, number_of_time_bounds
Data variables:
    Rad                                               (t, y, x) float32 dask.array<chunksize=(1, 5424, 5424), meta=np.ndarray>
    DQF                                               (t, y, x) float32 dask.array<chunksize=(1, 5424, 5424), meta=np.ndarray>
    time_bounds                                       (t, number_of_time_bounds) datetime64[ns] dask.array<chunksize=(1, 2), meta=np.ndarray>
    goes_imager_projection                            (t) int32 -2147483647 -2147483647
    y_image_bounds                                    (t, number_of_image_bounds) float32 dask.array<chunksize=(1, 2), meta=np.ndarray>
    x_image_bounds                                    (t, number_of_image_bounds) float32 dask.array<chunksize=(1, 2), meta=np.ndarray>
    nominal_satellite_subpoint_lat                    (t) float64 0.0 0.0
    nominal_satellite_subpoint_lon                    (t) float64 -75.2 -75.2
    nominal_satellite_height                          (t) float64 3.579e+04 3.579e+04
    geospatial_lat_lon_extent                         (t) float32 9.96921e+36 9.96921e+36
    yaw_flip_flag                                     (t) float64 0.0 0.0
    esun                                              (t) float64 nan nan
    kappa0                                            (t) float64 nan nan
    planck_fk1                                        (t) float64 8.51e+03 8.51e+03
    planck_fk2                                        (t) float64 1.286e+03 1.286e+03
    planck_bc1                                        (t) float64 0.2252 0.2252
    planck_bc2                                        (t) float64 0.9992 0.9992
    valid_pixel_count                                 (t) float64 2.305e+07 2.305e+07
    missing_pixel_count                               (t) float64 268.0 290.0
    saturated_pixel_count                             (t) float64 0.0 0.0
    undersaturated_pixel_count                        (t) float64 0.0 0.0
    focal_plane_temperature_threshold_exceeded_count  (t) float64 0.0 0.0
    min_radiance_value_of_valid_pixels                (t) float64 8.217 8.472
    max_radiance_value_of_valid_pixels                (t) float64 125.5 123.2
    mean_radiance_value_of_valid_pixels               (t) float64 82.01 81.96
    std_dev_radiance_value_of_valid_pixels            (t) float64 24.64 24.53
    maximum_focal_plane_temperature                   (t) float64 62.12 62.12
    focal_plane_temperature_threshold_increasing      (t) float64 81.0 81.0
    focal_plane_temperature_threshold_decreasing      (t) float64 81.0 81.0
    percent_uncorrectable_L0_errors                   (t) float64 0.0 0.0
    earth_sun_distance_anomaly_in_AU                  (t) float64 1.012 1.012
    algorithm_dynamic_input_data_container            (t) int32 -2147483647 -2147483647
    processing_parm_version_container                 (t) int32 -2147483647 -2147483647
    algorithm_product_version_container               (t) int32 -2147483647 -2147483647
    star_id                                           (t, num_star_looks) float32 dask.array<chunksize=(1, 24), meta=np.ndarray>
Attributes:
    naming_authority:          gov.nesdis.noaa
    Conventions:               CF-1.7
    Metadata_Conventions:      Unidata Dataset Discovery v1.0
    standard_name_vocabulary:  CF Standard Name Table (v35, 20 July 2016)
    institution:               DOC/NOAA/NESDIS > U.S. Department of Commerce,...
    project:                   GOES
    production_site:           WCDAS
    production_environment:    OE
    spatial_resolution:        2km at nadir
    orbital_slot:              GOES-East
    platform_ID:               G16
    instrument_type:           GOES R Series Advanced Baseline Imager
    scene_id:                  Full Disk
    instrument_ID:             FM1
    title:                     ABI L1b Radiances
    summary:                   Single emissive band ABI L1b Radiance Products...
    keywords:                  SPECTRAL/ENGINEERING > INFRARED WAVELENGTHS > ...
    keywords_vocabulary:       NASA Global Change Master Directory (GCMD) Ear...
    iso_series_metadata_id:    a70be540-c38b-11e0-962b-0800200c9a66
    license:                   Unclassified data.  Access is restricted to ap...
    processing_level:          National Aeronautics and Space Administration ...
    cdm_data_type:             Image
    dataset_name:              OR_ABI-L1b-RadF-M6C14_G16_s20191410240370_e201...
    production_data_source:    Realtime
    timeline_id:               ABI Mode 6
    date_created:              2019-05-21T02:50:14.3Z
    time_coverage_start:       2019-05-21T02:40:37.0Z
    time_coverage_end:         2019-05-21T02:50:07.8Z
    id:                        abb3657a-03c0-47a9-a1ba-f3196c07c5a9
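
The same promotion can be checked on in-memory data with xr.combine_nested, which performs the nested combine without reading files. This is only a sketch: make_scene is a hypothetical helper that builds a tiny synthetic dataset shaped like one of the files (a (y, x) variable with a scalar 't' coordinate).

```python
import numpy as np
import xarray as xr


def make_scene(timestamp):
    """Hypothetical stand-in for one file: a (y, x) image with a scalar 't' coordinate."""
    return xr.Dataset(
        data_vars={"Rad": (("y", "x"), np.zeros((3, 4)))},
        coords={
            "y": np.arange(3),
            "x": np.arange(4),
            "t": np.datetime64(timestamp),  # scalar coordinate, not a dimension
        },
    )


scenes = [make_scene("2019-05-21T02:45"), make_scene("2019-05-21T03:15")]

# Concatenate in list order along 't'; the scalar coordinate becomes a dimension coordinate.
combined = xr.combine_nested(scenes, concat_dim="t")

print(combined["Rad"].dims)  # ('t', 'y', 'x')
print(combined.sizes["t"])   # 2
```

Note that a nested combine keeps the order of the input list, so the inputs should already be in time order.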

Alternatively, you can define a function that promotes the coordinate 't' to a dimension coordinate and pass it to the preprocess argument in open_mfdataset. This function is applied to every imported NetCDF before it's concatenated with the others.

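def preprocessing(ds):
    # Promote the scalar coordinate 't' to a length-1 dimension coordinate.
    return ds.expand_dims(dim='t')

# concat_dim is omitted: combine='by_coords' does not use it, and recent xarray versions reject it.
ds_sst = xr.open_mfdataset("OR_ABI-L1b-RadF*nc", combine='by_coords', preprocess=preprocessing)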

The result is the same as above.
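
One practical difference between the two approaches is that by_coords orders the datasets by the values of 't' rather than by the order of the input list. A minimal sketch on synthetic data, assuming a hypothetical make_scene helper that mimics one file and using xr.combine_by_coords directly in place of open_mfdataset:

```python
import numpy as np
import xarray as xr


def make_scene(timestamp):
    """Hypothetical stand-in for one file: a (y, x) image with a scalar 't' coordinate."""
    return xr.Dataset(
        data_vars={"Rad": (("y", "x"), np.zeros((3, 4)))},
        coords={
            "y": np.arange(3),
            "x": np.arange(4),
            "t": np.datetime64(timestamp),  # scalar coordinate, not a dimension
        },
    )


# Deliberately list the later scene first.
scenes = [make_scene("2019-05-21T03:15"), make_scene("2019-05-21T02:45")]

# Same step as the preprocess function: give every dataset a length-1 't' dimension.
promoted = [ds.expand_dims("t") for ds in scenes]

# 't' is now a dimension coordinate that differs between datasets,
# so combine_by_coords can use its values to order and concatenate them.
combined = xr.combine_by_coords(promoted)

print(combined["Rad"].dims)  # ('t', 'y', 'x')
print(combined["t"].values)  # time stamps in ascending order
```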

Upvotes: 5
