Reputation: 45
Recently I tried to read MODIS cloud properties data. I tried to merge/combine the MODIS NetCDF files, but neither ncrcat nor CDO worked. Then I found that the variables in a MODIS file are organized into groups.
import netCDF4 as nc

a = 'MCD06COSP_M3_MODIS.A2002182.061.2020181145824.nc'
b = nc.Dataset(a)
print(b.groups.keys())               # list the groups in the file
c = b.groups['Cloud_Mask_Fraction']
print(c.variables['Mean'])           # inspect one variable of that group
which gives the following output:
dict_keys(['Solar_Zenith', 'Solar_Azimuth', 'Sensor_Zenith', 'Sensor_Azimuth', 'Cloud_Top_Pressure', 'Cloud_Mask_Fraction', 'Cloud_Mask_Fraction_Low', 'Cloud_Mask_Fraction_Mid', 'Cloud_Mask_Fraction_High', 'Cloud_Optical_Thickness_Liquid', 'Cloud_Optical_Thickness_Ice', 'Cloud_Optical_Thickness_Total', 'Cloud_Optical_Thickness_PCL_Total', 'Cloud_Optical_Thickness_Log10_Liquid', 'Cloud_Optical_Thickness_Log10_Ice', 'Cloud_Optical_Thickness_Log10_Total', 'Cloud_Particle_Size_Liquid', 'Cloud_Particle_Size_Ice', 'Cloud_Water_Path_Liquid', 'Cloud_Water_Path_Ice', 'Cloud_Retrieval_Fraction_Liquid', 'Cloud_Retrieval_Fraction_Ice', 'Cloud_Retrieval_Fraction_Total'])
<class 'netCDF4._netCDF4.Variable'>
float64 Mean(longitude, latitude)
_FillValue: -999.0
title: Cloud_Mask_Fraction: Mean
units: none
path = /Cloud_Mask_Fraction
unlimited dimensions:
current shape = (360, 180)
filling on
There are variables like this in many groups, and I need to read and merge many such files. So I am wondering: how can I read multiple NetCDF files that contain groups? How can I get an array for each variable with a new time dimension, since I have to read these data over several years? Can CDO, ncrcat, or xarray in Python merge this kind of nc file?
Thanks a lot. Yuhang
Upvotes: 0
Views: 949
Reputation: 1786
I would recommend using xarray, the state-of-the-art handler for multidimensional gridded data in Python.
You have to install netcdf4; I also recommend h5netcdf for faster processing.
import xarray

path_to_file = 'MCD06COSP_M3_MODIS.A2002182.061.2020181145824.nc'
# if h5netcdf is installed:
data = xarray.open_dataset(path_to_file, engine='h5netcdf')
# if just netcdf4 is installed:
data = xarray.open_dataset(path_to_file)
# access variables:
data[<variable_name>]
data.<variable_name>
# inspect whole file:
data
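Since your MODIS file keeps its variables inside groups, the root dataset will look empty. open_dataset also takes a group keyword, so you can open one group at a time; here is a small sketch using the Cloud_Mask_Fraction group from your output:
# open a single group of the file (group name taken from your output)
cloud_mask = xarray.open_dataset(path_to_file, group='Cloud_Mask_Fraction')
cloud_mask['Mean']  # the (longitude, latitude) Mean field of that group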
You can load multiple files into one dataset:
dataset = xarray.open_mfdataset([path_to_file_1, path_to_file_2], parallel=True)
I expect some errors if the files cover different time spans, but you can find ways to work around such an issue.
I added parallelisation (parallel=True) to speed up the parsing. Please share test data via a link to cloud storage or similar; otherwise the community cannot help you beyond these suggestions.
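To get the new time dimension you asked about, one option is to open the same group from every file and concatenate the results. This is only a sketch; it assumes all your monthly files share the same grid and that their names sort chronologically:
import glob
import xarray

# collect all monthly files; the A<year><day-of-year> part of the name
# makes an alphabetical sort chronological
files = sorted(glob.glob('MCD06COSP_M3_MODIS.A*.nc'))

# open one group from every file and stack the datasets along a new 'time' dimension
datasets = [xarray.open_dataset(f, group='Cloud_Mask_Fraction') for f in files]
cloud_mask = xarray.concat(datasets, dim='time')

cloud_mask['Mean']  # now has dimensions (time, longitude, latitude)
The time axis here is just an index; if you need real dates you could parse them from the file names and set them with cloud_mask.assign_coords.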
PS: Please choose variable names wisely ;)
Upvotes: 1