Ann M

Reputation: 249

Python - xarray mean between two netcdf files

I have yearly NetCDF files containing daily minimum and maximum temperature data.

What I want is the daily average temperature, computed from those two variables.

I thought this would be easier with xarray, and I've managed to merge all the files into one dataset per variable like this:

import netCDF4 as nc
import numpy as np
import xarray

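# combine='by_coords' orders the files by their time coordinate, so
# concat_dim is not needed (recent xarray versions raise an error if both are given)
tmin = xarray.open_mfdataset('TMIN*.nc', combine='by_coords')
tmax = xarray.open_mfdataset('TMAX*.nc', combine='by_coords')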

Then I tried something like: tavg = (tmax + tmin) / 2

But I got an empty array (shown below):

<xarray.Dataset>
Dimensions:  (lat: 294, lon: 402, time: 25567)
Coordinates:
  * lat      (lat) float32 11.9125 11.995833 12.079166 ... 36.245834 36.329166
  * lon      (lon) float32 -119.4375 -119.354164 ... -86.104164 -86.020836
  * time     (time) datetime64[ns] 1950-01-01 1950-01-02 ... 2019-12-31
Data variables:
    *empty*

How can I get the mean between the two variables for each day?

As suggested, here are the summaries for both tmin and tmax:

<xarray.Dataset>
Dimensions:  (lat: 294, lon: 402, time: 25567)
Coordinates:
  * lon      (lon) float32 -119.4375 -119.354164 ... -86.104164 -86.020836
  * lat      (lat) float32 11.9125 11.995833 12.079166 ... 36.245834 36.329166
  * time     (time) datetime64[ns] 1950-01-01 1950-01-02 ... 2019-12-31
Data variables:
    TMAX     (time, lat, lon) float32 dask.array<chunksize=(365, 294, 402), meta=np.ndarray>


<xarray.Dataset>
Dimensions:  (lat: 294, lon: 402, time: 25567)
Coordinates:
  * lon      (lon) float32 -119.4375 -119.354164 ... -86.104164 -86.020836
  * lat      (lat) float32 11.9125 11.995833 12.079166 ... 36.245834 36.329166
  * time     (time) datetime64[ns] 1950-01-01 1950-01-02 ... 2019-12-31
Data variables:
    TMIN     (time, lat, lon) float32 dask.array<chunksize=(365, 294, 402), meta=np.ndarray>

Upvotes: 1

Views: 1965

Answers (1)

drcrisp

Reputation: 333

I think the problem is that tmin and tmax are Datasets, not DataArrays.

When you combine two Datasets arithmetically, xarray matches their data variables by name. Your two Datasets hold differently named variables (TMIN and TMAX), so nothing matches and the result has no data variables. A Dataset can hold many variables, so xarray will not guess which ones you mean to pair.
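The name matching can be seen in a small sketch. The grid, the constant values, and the common name "T" are made up for illustration; only the variable names TMIN and TMAX come from the question.

```python
import numpy as np
import xarray as xr

# Two tiny single-variable Datasets on the same (hypothetical) grid.
coords = {"lat": [0.0, 1.0], "lon": [10.0, 20.0, 30.0]}
ds_min = xr.Dataset({"TMIN": (("lat", "lon"), np.full((2, 3), 10.0))}, coords=coords)
ds_max = xr.Dataset({"TMAX": (("lat", "lon"), np.full((2, 3), 20.0))}, coords=coords)

# The Datasets share no variable name, so the result has no data variables.
empty = (ds_max + ds_min) / 2
print(list(empty.data_vars))

# Once both variables carry the same name ("T" is an arbitrary choice),
# Dataset arithmetic pairs them up.
same_name = (ds_max.rename({"TMAX": "T"}) + ds_min.rename({"TMIN": "T"})) / 2
print(same_name["T"].values)
```

Renaming is one way around the problem if you would rather keep working with Datasets; picking out the DataArrays is usually simpler.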

To solve this, select from each dataset the variable you want to add.

import xarray as xr
import numpy as np

lon = np.arange(129.4, 153.75+0.05, 0.25)
lat = np.arange(-43.75, -10.1+0.05, 0.25)

Tmin = 10 * np.random.rand(len(lat), len(lon))
Tmax = 10 * np.random.rand(len(lat), len(lon))


Tmin = xr.Dataset({"Tmin": (["lat", "lon"], Tmin)},coords={"lon": lon,"lat": lat})
Tmax = xr.Dataset({"Tmax": (["lat", "lon"], Tmax)},coords={"lon": lon,"lat": lat})

# Just checking the datasets are not empty
print(Tmin)
print(Tmax)

# This returns a Dataset with no data variables, as in your example
tavg = (Tmax+Tmin)/2
print(tavg)

# Selecting the variable should work
tavg = (Tmax['Tmax']+Tmin['Tmin'])/2
print(tavg)
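Applied to the layout in the question (variables TMAX and TMIN on time, lat, lon), the same idea might look like the sketch below. The small synthetic arrays stand in for the real files, and the output name TAVG is an assumption chosen for illustration.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-ins for the datasets opened with open_mfdataset.
time = pd.date_range("1950-01-01", periods=3, freq="D")
lat = np.array([11.9125, 11.995833], dtype="float32")
lon = np.array([-119.4375, -119.354164], dtype="float32")
coords = {"time": time, "lat": lat, "lon": lon}
dims = ("time", "lat", "lon")
shape = (len(time), len(lat), len(lon))

tmin = xr.Dataset({"TMIN": (dims, np.full(shape, 10.0, dtype="float32"))}, coords=coords)
tmax = xr.Dataset({"TMAX": (dims, np.full(shape, 20.0, dtype="float32"))}, coords=coords)

# Select the DataArrays, average them, and give the result a name
# ("TAVG" is a hypothetical choice).
tavg = ((tmax["TMAX"] + tmin["TMIN"]) / 2).rename("TAVG")

# Wrap it back into a Dataset if you want to write it out later.
tavg_ds = tavg.to_dataset()
print(tavg_ds)
```

With the dask-backed arrays from open_mfdataset, the same expression stays lazy until you load or save the result.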

Upvotes: 1
