Reputation: 123

How do I loop over NetCDF time series?

I have some NetCDF time series data. After reading the variables, I want to build a list of global average values for each variable. The code I wrote works, but it is not elegant. How can I write this loop in a better way?

variables = 'name1 name2 name3 name4'.split()
name1  =[]
name2  =[]
name3  =[]
name4  =[]

for i in range (9,40):
    name1_irrY = name1_aSrc[i].mean()
    name1.append(name1_irrY)
    name2_irrY = name2_aSrc[i].mean()
    name2.append(name2_irrY)
    name3_irrY = name3_aSrc[i].mean()
    name3.append(name3_irrY)
    name4_irrY = name4_aSrc[i].mean()
    name4.append(name4_irrY)

Each "name"_aSrc[i,:,:] is a NetCDF variable.

Since I have a lot of files, I need an efficient way to do this.

Upvotes: 0

Answers (2)

ClimateUnboxed

Reputation: 8085

Note that the question asks for a "global average value", which implies a spatial average over lat and lon. In that case the plain mean() used in the question, and also in Bart's solution, is not correct and can give very inaccurate results: on a regular lat-lon grid the cells shrink toward the poles, so an unweighted mean gives the high latitudes too much influence (see this link for more details). Bart's example file is a function of height, so this is not an issue for his particular file, but I think it is for the OP.

This is an example from the xarray page on how to apply weights.

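import numpy as np
import xarray as xr

ds = xr.tutorial.load_dataset("air_temperature")
air = ds.air  # DataArray with dims (time, lat, lon)
# Grid-cell area on a regular lat-lon grid is proportional to cos(latitude)
weights = np.cos(np.deg2rad(air.lat))
weights.name = "weights"
air_weighted = air.weighted(weights)
weighted_mean = air_weighted.mean(("lon", "lat"))  # one value per time step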
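
For arrays that are already plain NumPy (as in the question), the same cosine-latitude weighting can be sketched without xarray. This is only an illustration under assumptions: the 2.5-degree grid, the synthetic field and the helper name global_mean are all made up here, and the data is assumed to have shape (time, lat, lon) on a regular lat-lon grid.

```python
import numpy as np

# Hypothetical regular 2.5-degree grid (cell centres); not taken from the question.
lat = np.arange(-88.75, 90.0, 2.5)   # 72 latitudes
lon = np.arange(0.0, 360.0, 2.5)     # 144 longitudes
ntime = 3

# Synthetic field: largest at the equator, zero at the poles, same at every time step.
field = np.broadcast_to(
    30.0 * np.cos(np.deg2rad(lat))[None, :, None], (ntime, lat.size, lon.size)
)

def global_mean(data, lat):
    """Cosine-latitude weighted mean over the two trailing axes (lat, lon)."""
    weights = np.cos(np.deg2rad(lat))
    zonal = data.mean(axis=-1)  # all cells in a latitude row share one weight
    return np.average(zonal, axis=-1, weights=weights)

unweighted = field.mean(axis=(1, 2))  # what a plain mean() gives
weighted = global_mean(field, lat)    # one value per time step

print(unweighted[0], weighted[0])
```

For this synthetic field the unweighted value comes out noticeably lower than the weighted one, because the small polar cells count as much as the large tropical ones.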

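Alternatively, you can use the cdo package from within Python; its fldmean operator computes the area-weighted field mean:

from cdo import Cdo

cdo = Cdo()
# Writes the field mean of every time step in in.nc to out.nc
res = cdo.fldmean(input="in.nc", output="out.nc")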

Upvotes: 0

Bart

Reputation: 10248

I don't think you need the loop at all, since you can specify along which axis you want to calculate the mean. So something like this should be sufficient (this replaces the entire block of code that you posted):

name1 = np.mean(name1_aSrc[9:40,:,:], axis=(1,2))
name2 = np.mean(name2_aSrc[9:40,:,:], axis=(1,2))
# etc..
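
To avoid repeating that line for every variable, the arrays could be kept in a dict keyed by variable name. A minimal sketch, assuming each array has shape (time, lat, lon): the random arrays stand in for the name1_aSrc ... name4_aSrc variables, and the helper name yearly_means is hypothetical. This is still the unweighted mean.

```python
import numpy as np

variables = "name1 name2 name3 name4".split()

# Stand-in data: in practice each entry would be the (time, lat, lon) array
# read from the NetCDF file. The shapes here are assumptions.
rng = np.random.default_rng(0)
sources = {name: rng.random((45, 6, 12)) for name in variables}

def yearly_means(arrays, start=9, stop=40):
    """Unweighted mean over the two trailing axes for time steps start..stop-1."""
    return {name: arr[start:stop].mean(axis=(1, 2)) for name, arr in arrays.items()}

means = yearly_means(sources)
print(means["name1"].shape)  # (31,)
```

Each entry of means is then a 1-D array with one value per time step, replacing the separate name1 ... name4 lists.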

A small example with some NetCDF data that I had lying around:

import xarray as xr
import numpy as np

f = xr.open_dataset('u.xz.nc', decode_times=False)
u = f['u'].values

print(u.shape)  # prints: (5, 96, 128, 1)

umean = np.mean(u, axis=(1,2,3))

print(umean.shape) # prints: (5,)

An alternative solution is to let xarray calculate the mean over a (named) dimension or multiple dimensions. Quick example with some other data:

import xarray as xr
import numpy as np

f = xr.open_dataset('drycblles_default_0000000.nc', decode_times=False)

# Original file has 3 dimensions:
print(f.dims)   # prints Frozen(SortedKeysDict({'time': 37, 'z': 32, 'zh': 33}))

# Calculate mean over one single dimension:
fm1 = f.mean(dim='z')
print(fm1.dims)  # prints Frozen(SortedKeysDict(OrderedDict([('time', 37), ('zh', 33)])))

# Calculate mean over multiple dimensions:
fm2 = f.mean(dim=['z','zh'])
print(fm2.dims)  # prints Frozen(SortedKeysDict(OrderedDict([('time', 37)])))

fm1 and fm2 are again simply xarray datasets:

<xarray.Dataset>
Dimensions:  (time: 37)
Coordinates:
  * time     (time) float64 0.0 300.0 600.0 900.0 ... 1.02e+04 1.05e+04 1.08e+04
Data variables:
    iter     (time) float64 0.0 5.0 10.0 15.0 20.0 ... 282.0 293.0 305.0 317.0
    area     (time) float64 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0 1.0
    areah    (time) float64 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0 1.0
    th       (time) float64 304.8 304.8 304.8 304.8 ... 305.1 305.1 305.1 305.1
    th_3     (time) float64 1.246e-08 -3.435e-11 ... 7.017e-06 5.548e-05

Upvotes: 2
