Reputation: 123
I have some NetCDF time series data. After reading the variables, I want to build a list of global average values for each of them. The code I wrote works, but it is not elegant. How can I write this loop in a better way?
variables = 'name1 name2 name3 name4'.split()
name1 =[]
name2 =[]
name3 =[]
name4 =[]
for i in range(9, 40):
    name1_irrY = name1_aSrc[i].mean()
    name1.append(name1_irrY)
    name2_irrY = name2_aSrc[i].mean()
    name2.append(name2_irrY)
    name3_irrY = name3_aSrc[i].mean()
    name3.append(name3_irrY)
    name4_irrY = name4_aSrc[i].mean()
    name4.append(name4_irrY)
Each "name"_aSrc[i,:,:] is a NetCDF variable.
Since I have a lot of files, I need an efficient way to do this.
Upvotes: 0
Views: 1152
Reputation: 8085
Note that this question asks for a "global average value", which implies a spatial average over lat and lon. In that case the plain mean() used in the question, and also in Bart's solution, is not correct and can lead to very inaccurate results: on a regular lat-lon grid the cells shrink toward the poles, so each one has to be weighted by its area (proportional to the cosine of latitude). See this link for more details. Bart's example file is a function of height, so this is not an issue for his particular file, but I think it was for the OP.
This is an example from the xarray page on how to apply weights.
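import numpy as np
import xarray as xr
ds = xr.tutorial.load_dataset("air_temperature")
air = ds.air  # the DataArray to average
# Grid-cell area on a regular lat-lon grid is proportional to cos(lat)
weights = np.cos(np.deg2rad(air.lat))
weights.name = "weights"
air_weighted = air.weighted(weights)
weighted_mean = air_weighted.mean(("lon", "lat"))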
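To see how much the weighting matters without downloading anything, here is a minimal NumPy sketch. The grid spacing and the synthetic field are made up for illustration (they are not from the OP's files), and it assumes a regular lat-lon grid where cos(lat) is a good proxy for cell area.

```python
import numpy as np

# Hypothetical regular grid of 2.5-degree cell centres (an assumption).
lat = np.arange(-88.75, 90.0, 2.5)
lon = np.arange(1.25, 360.0, 2.5)

# Synthetic field that only depends on latitude: large in the tropics,
# close to zero near the poles.
field = 30.0 * np.cos(np.deg2rad(lat))[:, np.newaxis] * np.ones((lat.size, lon.size))


def global_mean(data, lat_deg):
    """Area-weighted mean of a (lat, lon) array on a regular grid."""
    weights = np.cos(np.deg2rad(lat_deg))
    # All cells along one latitude have the same area, so a plain mean
    # over lon is fine; only the lat direction needs weights.
    zonal_mean = data.mean(axis=1)
    return np.average(zonal_mean, weights=weights)


unweighted = field.mean()          # treats polar cells like tropical ones
weighted = global_mean(field, lat)  # accounts for shrinking cell area

print(unweighted, weighted)
```

The unweighted value comes out lower here because the many small polar cells, where the field is near zero, count as much as the large tropical ones.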
Alternatively, you can use the cdo package from within Python; its fldmean operator computes an area-weighted field mean:
from cdo import Cdo
cdo=Cdo()
res=cdo.fldmean(input="in.nc",output="out.nc")
Upvotes: 0
Reputation: 10248
I don't think you need the loop at all, since you can specify along which axis you want to calculate the mean. So something like this should be sufficient (this replaces the entire block of code that you posted):
name1 = np.mean(name1_aSrc[9:40,:,:], axis=(1,2))
name2 = np.mean(name2_aSrc[9:40,:,:], axis=(1,2))
# etc..
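If you would rather not repeat that line for every variable, a dictionary keyed by variable name is one way to do it. This is only a sketch: the `sources` dictionary and the random stand-in arrays are my own placeholders for the arrays you read from your NetCDF files, and the mean is still the plain, unweighted one.

```python
import numpy as np

variables = 'name1 name2 name3 name4'.split()

# Stand-in data (hypothetical): in practice each entry would be the
# (time, lat, lon) array read from the NetCDF file.
rng = np.random.default_rng(0)
sources = {name: rng.random((50, 4, 8)) for name in variables}

# One mean per time step for steps 9..39, for every variable.
means = {
    name: np.mean(src[9:40, :, :], axis=(1, 2))
    for name, src in sources.items()
}

for name in variables:
    print(name, means[name].shape)
```

Each entry of `means` is a 1-D array with one value per selected time step; call `.tolist()` on it if you really need a Python list.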
A small example with some NetCDF data that I had lying around:
import xarray as xr
import numpy as np
f = xr.open_dataset('u.xz.nc', decode_times=False)
u = f['u'].values
print(u.shape) # prints: (5, 96, 128, 1)
umean = np.mean(u, axis=(1,2,3))
print(umean.shape) # prints: (5,)
An alternative solution is to let xarray calculate the mean over a (named) dimension or multiple dimensions. Quick example with some other data:
import xarray as xr
import numpy as np
f = xr.open_dataset('drycblles_default_0000000.nc', decode_times=False)
# Original file has 3 dimensions:
print(f.dims) # prints Frozen(SortedKeysDict({'time': 37, 'z': 32, 'zh': 33}))
# Calculate mean over one single dimension:
fm1 = f.mean(dim='z')
print(fm1.dims) # prints Frozen(SortedKeysDict(OrderedDict([('time', 37), ('zh', 33)])))
# Calculate mean over multiple dimensions:
fm2 = f.mean(dim=['z','zh'])
print(fm2.dims) # prints Frozen(SortedKeysDict(OrderedDict([('time', 37)])))
fm1 and fm2 are again simply xarray datasets:
<xarray.Dataset>
Dimensions:  (time: 37)
Coordinates:
  * time     (time) float64 0.0 300.0 600.0 900.0 ... 1.02e+04 1.05e+04 1.08e+04
Data variables:
    iter     (time) float64 0.0 5.0 10.0 15.0 20.0 ... 282.0 293.0 305.0 317.0
    area     (time) float64 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0 1.0
    areah    (time) float64 1.0 1.0 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0 1.0 1.0
    th       (time) float64 304.8 304.8 304.8 304.8 ... 305.1 305.1 305.1 305.1
    th_3     (time) float64 1.246e-08 -3.435e-11 ... 7.017e-06 5.548e-05
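For data on a lat-lon grid, the same whole-dataset approach should also work with latitude weights, so that every variable gets its area-weighted mean in one call. A rough sketch on a synthetic dataset; the variable names, grid and random values are placeholders I made up, and it assumes an xarray version that provides Dataset.weighted.

```python
import numpy as np
import xarray as xr

# Synthetic (time, lat, lon) dataset standing in for a real NetCDF file.
time = np.arange(4)
lat = np.linspace(-87.5, 87.5, 36)
lon = np.arange(0.0, 360.0, 5.0)
rng = np.random.default_rng(0)

ds = xr.Dataset(
    {
        name: (("time", "lat", "lon"),
               rng.normal(size=(time.size, lat.size, lon.size)))
        for name in ["name1", "name2"]
    },
    coords={"time": time, "lat": lat, "lon": lon},
)

# cos(lat) as a proxy for grid-cell area on a regular grid.
weights = np.cos(np.deg2rad(ds.lat))
weights.name = "weights"

# Weighted mean over both horizontal dimensions, for all variables at once.
global_means = ds.weighted(weights).mean(("lat", "lon"))

print(global_means)
```

The result is a dataset with only the time dimension left, so global_means["name1"].values gives the time series for one variable.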
Upvotes: 2