Reputation: 391
I have two spatial dataset in netcdf format. They have same time, dimensions, coordinates, and data variable. But they are for different spatial coordinates. In below I tried to show my two dataset by a polygon:
import glob
import xarray as xr
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
file1 = '20190109T071048.nc'
file2 = '20190109T085117.nc'
ds1 = xr.open_dataset(file1, group='PRODUCT')
ds2 = xr.open_dataset(file2, group='PRODUCT')
PATH_TO_GPK = 'Study_Area.gpkg'
SA = gpd.read_file(PATH_TO_GPK, layer='Study_Area')
First dataset plot:
plt.figure(figsize=(12,8))
ax = plt.axes()
ds1.qa_value.isel(time = 0).plot(ax = ax, x='longitude', y='latitude')
SA.plot(ax = ax, alpha = 0.8, facecolor = 'none')
Second dataset plot:
plt.figure(figsize=(12,8))
ax = plt.axes()
ds2.qa_value.isel(time = 0).plot(ax = ax, x='longitude', y='latitude')
SA.plot(ax = ax, alpha = 0.8, facecolor = 'none')
I want to merge these two netcdf files with xarray.
combined = xr.merge([ds1, ds2], compat='no_conflicts')
Error:
MergeError: conflicting values for variable 'latitude' on objects to be combined. You can skip this check by specifying compat='override'.
tried with:
combined = xr.merge([ds1, ds2], compat='override')
but plot of combined
was same as above first plot.
Then tried with:
combined = xr.combine_by_coords([ds1,ds2], compat='no_conflicts')
Error:
Could not find any dimension coordinates to use to order the datasets for concatenation
Then tried with:
combined = xr.combine_nested([ds1,ds2], concat_dim=["time"])
and plot of combined
was again same as first plot.
Based on ThomasNicolas suggestion, I used below code:
ds = xr.open_mfdataset([file1, file2], combine='nested')
But it return this error:
AttributeError: 'Dataset' object has no attribute 'qa_value'
There are not any data in result:
print of the first dataset (for example) shows:
print (ds1)
<xarray.Dataset>
Dimensions: (corner: 4, ground_pixel: 450, scanline: 3245, time: 1)
Coordinates:
* scanline (scanline) float64 0.0 1.0 ... 3.244e+03
* ground_pixel (ground_pixel) float64 0.0 1.0 ... 449.0
* time (time) datetime64[ns] 2019-01-03
* corner (corner) float64 0.0 1.0 2.0 3.0
latitude (time, scanline, ground_pixel) float32 ...
longitude (time, scanline, ground_pixel) float32 ...
Data variables:
delta_time (time, scanline) timedelta64[ns] 08:07:0...
time_utc (time, scanline) object '2019-01-03T08:0...
qa_value (time, scanline, ground_pixel) float32 ...
Is there any suggestion for merge or combine of these files?
Base on @dl.meteo advice, I used satpy
library for solve my problem, it seems that it can merge two netcdf files but not completely, you can see an incorrect part (red boundary) in joined image.
Can satpy
do it correctly?
# Read NetCDF files
from satpy import Scene
import glob
filenames = glob.glob('myfiles*.nc')
scn = Scene(filenames=filenames, reader='tropomi_l2')
scn.load(['qq'])
mask = SA_mask_poly.mask(d, lat_name='latitude', lon_name='longitude')
out_sel = d.compute().where(mask == 0, drop=True)
plt.figure(figsize=(12,8))
ax = plt.axes()
out_sel.plot(ax = ax, x='longitude', y='latitude')
SA.plot(ax = ax, alpha = 0.8, facecolor = 'none', lw = 1)
Upvotes: 1
Views: 1196
Reputation: 31
I've come across this problem just now. xarray
can't combine values with different coordinates. As your two passes have their own unique coordinates, you can't directly combine them.
One solution for this is to use the pyresample
module to resample both granules from their own coordinates onto a common grid. Open each file as a separate Scene
and then apply scn.resample()
method. This will put both onto the same grid. From there you can combine them.
Another solution might be the experimental MultiScene
object, which is designed for this use case. As per the documentation:
Scene objects in Satpy are meant to represent a single geographic region at a specific single instant in time or range of time. This means they are not suited for handling multiple orbits of polar-orbiting satellite data, multiple time steps of geostationary satellite data, or other special data cases. To handle these cases Satpy provides the MultiScene class.
The reason your current solution has artefacts is your Scene
object has two separate orbits stuck together as one array. I think the discontinuity in their coordinates will cause stretch/tear artefacts in your quadmesh plot and further processing, such as convolution filtering, is likely to return unexpected results as it expects neighbouring values in an array to be physically neighbours in the final image and not in another orbit.
Upvotes: 1