HMadadi
HMadadi

Reputation: 391

Spatial merge/combine of multiple netcdf with xarray or satpy

I have two spatial dataset in netcdf format. They have same time, dimensions, coordinates, and data variable. But they are for different spatial coordinates. In below I tried to show my two dataset by a polygon:

import glob
import xarray as xr 
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

file1 = '20190109T071048.nc'
file2 = '20190109T085117.nc'

ds1 = xr.open_dataset(file1, group='PRODUCT')
ds2 = xr.open_dataset(file2, group='PRODUCT')

PATH_TO_GPK = 'Study_Area.gpkg'
SA = gpd.read_file(PATH_TO_GPK, layer='Study_Area')

First dataset plot:

plt.figure(figsize=(12,8))
ax = plt.axes()
ds1.qa_value.isel(time = 0).plot(ax = ax, x='longitude', y='latitude')
SA.plot(ax = ax, alpha = 0.8, facecolor = 'none')

enter image description here

Second dataset plot:

plt.figure(figsize=(12,8))
ax = plt.axes()
ds2.qa_value.isel(time = 0).plot(ax = ax, x='longitude', y='latitude')
SA.plot(ax = ax, alpha = 0.8, facecolor = 'none')

enter image description here

I want to merge these two netcdf files with xarray.

combined = xr.merge([ds1, ds2], compat='no_conflicts')

Error:

MergeError: conflicting values for variable 'latitude' on objects to be combined. You can skip this check by specifying compat='override'.

tried with:

combined = xr.merge([ds1, ds2], compat='override')

but plot of combined was same as above first plot. Then tried with:

combined = xr.combine_by_coords([ds1,ds2], compat='no_conflicts')

Error:

Could not find any dimension coordinates to use to order the datasets for concatenation

Then tried with:

combined = xr.combine_nested([ds1,ds2], concat_dim=["time"])

and plot of combined was again same as first plot. Based on ThomasNicolas suggestion, I used below code:

ds = xr.open_mfdataset([file1, file2], combine='nested')

But it return this error:

AttributeError: 'Dataset' object has no attribute 'qa_value'

There are not any data in result:

enter image description here

print of the first dataset (for example) shows:

print (ds1)

<xarray.Dataset>
Dimensions:                          (corner: 4, ground_pixel: 450, scanline: 3245, time: 1)
Coordinates:
  * scanline                         (scanline) float64 0.0 1.0 ... 3.244e+03
  * ground_pixel                     (ground_pixel) float64 0.0 1.0 ... 449.0
  * time                             (time) datetime64[ns] 2019-01-03
  * corner                           (corner) float64 0.0 1.0 2.0 3.0
    latitude                         (time, scanline, ground_pixel) float32 ...
    longitude                        (time, scanline, ground_pixel) float32 ...
Data variables:
    delta_time                       (time, scanline) timedelta64[ns] 08:07:0...
    time_utc                         (time, scanline) object '2019-01-03T08:0...
    qa_value                         (time, scanline, ground_pixel) float32 ...  

Is there any suggestion for merge or combine of these files?

UPDATED

Base on @dl.meteo advice, I used satpy library for solve my problem, it seems that it can merge two netcdf files but not completely, you can see an incorrect part (red boundary) in joined image. Can satpy do it correctly?

# Read NetCDF files
from satpy import Scene
import glob
    
filenames = glob.glob('myfiles*.nc')
scn = Scene(filenames=filenames, reader='tropomi_l2')
scn.load(['qq'])

mask = SA_mask_poly.mask(d, lat_name='latitude', lon_name='longitude')
out_sel = d.compute().where(mask == 0, drop=True)

plt.figure(figsize=(12,8))
ax = plt.axes()
out_sel.plot(ax = ax, x='longitude', y='latitude')
SA.plot(ax = ax, alpha = 0.8, facecolor = 'none', lw = 1)

enter image description here

Upvotes: 1

Views: 1196

Answers (1)

Angoose
Angoose

Reputation: 31

I've come across this problem just now. xarray can't combine values with different coordinates. As your two passes have their own unique coordinates, you can't directly combine them.

One solution for this is to use the pyresample module to resample both granules from their own coordinates onto a common grid. Open each file as a separate Scene and then apply scn.resample() method. This will put both onto the same grid. From there you can combine them.

Another solution might be the experimental MultiScene object, which is designed for this use case. As per the documentation:

Scene objects in Satpy are meant to represent a single geographic region at a specific single instant in time or range of time. This means they are not suited for handling multiple orbits of polar-orbiting satellite data, multiple time steps of geostationary satellite data, or other special data cases. To handle these cases Satpy provides the MultiScene class.

The reason your current solution has artefacts is your Scene object has two separate orbits stuck together as one array. I think the discontinuity in their coordinates will cause stretch/tear artefacts in your quadmesh plot and further processing, such as convolution filtering, is likely to return unexpected results as it expects neighbouring values in an array to be physically neighbours in the final image and not in another orbit.

Upvotes: 1

Related Questions