Compute monthly covariance in xarray

Question

I have wind speed data in two xarray.DataArray objects:

u_250

Dims:
time: 600  latitude: 20  longitude: 40

Coordinates:
time      (time)      datetime64[ns]  1970-01-01 ... 2019-12-01
longitude (longitude) float64         101.0 103.0 105.0 ... 177.0 179.0
latitude  (latitude)  float64         1.0 3.0 5.0 7.0 ... 35.0 37.0 39.0

u_850

Dims:
time: 600  latitude: 20  longitude: 40

Coordinates:
time      (time)      datetime64[ns]  1970-01-01 ... 2019-12-01
longitude (longitude) float64         101.0 103.0 105.0 ... 177.0 179.0
latitude  (latitude)  float64         1.0 3.0 5.0 7.0 ... 35.0 37.0 39.0

u_250 and u_850 are the monthly mean wind speeds for every month from January 1970 through December 2019 (50 years, 600 time steps).

Now, I want to compute the covariance at each grid point for each month. For example, for the point (latitude = 1.0, longitude = 101.0) in January, there are 50 data points for u_250 and u_850 respectively, and I want to calculate the covariance of these two variables.

I know that the monthly means and variances can be computed with u_250.groupby('time.month').mean() and u_250.groupby('time.month').var(), but how do I compute the monthly covariance between the two variables?

The expected result should look like this:

cov_u250_u850

month: 12  latitude: 20  longitude: 40

longitude  (longitude)  float64  101.0 103.0 105.0 ... 177.0 179.0
latitude   (latitude)   float64  1.0 3.0 5.0 7.0 ... 35.0 37.0 39.0
month      (month)      int64    1 2 3 4 5 6 7 8 9 10 11 12

Michael Delgado · Accepted Answer

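You could bundle the two arrays into a Dataset, group it by month with groupby("time.month"), and compute xr.cov over the time dimension within each group using .apply (in recent xarray versions, .map is the preferred name for the same operation). The example below uses dummy data with one time step per year, so only month 12 appears in the result; with your monthly data you would get all 12 months.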

In [3]: ds = xr.Dataset({"u_250": u_250, "u_850": u_850})
   ...: ds
Out[3]:
<xarray.Dataset>
Dimensions:    (time: 49, latitude: 20, longitude: 40)
Coordinates:
  * time       (time) datetime64[ns] 1970-12-31 1971-12-31 ... 2018-12-31
  * latitude   (latitude) int64 1 3 5 7 9 11 13 15 ... 25 27 29 31 33 35 37 39
  * longitude  (longitude) int64 101 103 105 107 109 111 ... 171 173 175 177 179
Data variables:
    u_250      (time, latitude, longitude) float64 0.8598 0.1653 ... 0.8918
    u_850      (time, latitude, longitude) float64 0.8598 0.1653 ... 0.8918

In [4]: ds.groupby("time.month").apply(lambda x: xr.cov(x.u_250, x.u_850, dim="time"))
Out[4]:
<xarray.DataArray (month: 1, latitude: 20, longitude: 40)>
array([[[0.09614746, 0.06594801, 0.07543877, 0.0778078 , 0.08283739,
         0.10066671, 0.09872022, 0.07863419, 0.07910119, 0.07795797,
         ...
         0.07400784, 0.08243132, 0.08108255, 0.08245004, 0.07669093,
         0.07766512, 0.07228637, 0.09536881, 0.09932027, 0.076379  ],
        [0.07673605, 0.09930399, 0.07511875, 0.08196243, 0.07823345,
...
         0.05127114, 0.09518304, 0.07020337, 0.07952586, 0.08139218],
        [0.08093504, 0.08978857, 0.07013655, 0.07011703, 0.08179119,
         0.07653206, 0.0754078 , 0.08776793, 0.08901924, 0.07567376,
         ...
         0.08344793, 0.0575026 , 0.07411288, 0.08003397, 0.08374315,
         0.08206529, 0.09083054, 0.09397327, 0.06969347, 0.07091056]]])
Coordinates:
  * latitude   (latitude) int64 1 3 5 7 9 11 13 15 ... 25 27 29 31 33 35 37 39
  * longitude  (longitude) int64 101 103 105 107 109 111 ... 171 173 175 177 179
  * month      (month) int64 12
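
For a version that can be run end to end, here is a minimal sketch on random data shaped like the arrays in the question. The values, the random seed, and the use of pandas to build the time axis are assumptions made for illustration; with real data you would skip the construction step and start from the existing u_250 and u_850.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-ins for the question's arrays (random values, not real wind data).
time = pd.date_range("1970-01-01", "2019-12-01", freq="MS")  # 600 month starts
lat = np.arange(1.0, 40.0, 2.0)      # 20 points: 1.0 ... 39.0
lon = np.arange(101.0, 180.0, 2.0)   # 40 points: 101.0 ... 179.0
rng = np.random.default_rng(0)
coords = {"time": time, "latitude": lat, "longitude": lon}
dims = ("time", "latitude", "longitude")
u_250 = xr.DataArray(rng.random((600, 20, 40)), coords=coords, dims=dims)
u_850 = xr.DataArray(rng.random((600, 20, 40)), coords=coords, dims=dims)

ds = xr.Dataset({"u_250": u_250, "u_850": u_850})

# Group by calendar month, then take the covariance over time within each group.
# .map does the same job as .apply in the session above.
cov_u250_u850 = ds.groupby("time.month").map(
    lambda x: xr.cov(x.u_250, x.u_850, dim="time")
)
print(cov_u250_u850.sizes)
```

Each group holds the 50 samples for one calendar month, and xr.cov uses ddof=1 by default, so the values agree with np.cov applied to the same 50-point series at a single grid point.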

See the GroupBy: Group and Bin Data section of the user guide for more info and the GroupBy section of the xarray tutorial for additional examples.
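
If you would rather stay with the groupby(...).mean() pattern from the question, the sample covariance can also be assembled from grouped means via cov(x, y) = n/(n-1) * (mean(x*y) - mean(x)*mean(y)). This is a different route from the xr.cov approach above, sketched here on small random data; the helper name monthly_cov, the grid size, and the seed are illustrative assumptions, and the sketch assumes there are no missing values.

```python
import numpy as np
import pandas as pd
import xarray as xr

def monthly_cov(a, b):
    """Sample covariance (ddof=1) per calendar month, built from grouped means.

    Hypothetical helper; assumes a and b share coordinates and contain no NaNs.
    """
    month = "time.month"
    n = a.groupby(month).count()             # samples per month and grid point
    mean_ab = (a * b).groupby(month).mean()  # E[xy] within each month
    mean_a = a.groupby(month).mean()
    mean_b = b.groupby(month).mean()
    # Rescale the population covariance to the sample (ddof=1) covariance.
    return (mean_ab - mean_a * mean_b) * n / (n - 1)

# Small random example: 10 years of monthly data on a 2 x 3 grid.
time = pd.date_range("2000-01-01", periods=120, freq="MS")
rng = np.random.default_rng(1)
dims = ("time", "latitude", "longitude")
coords = {"time": time, "latitude": [1.0, 3.0], "longitude": [101.0, 103.0, 105.0]}
a = xr.DataArray(rng.random((120, 2, 3)), coords=coords, dims=dims)
b = xr.DataArray(rng.random((120, 2, 3)), coords=coords, dims=dims)

cov = monthly_cov(a, b)
print(cov.dims)
```

This avoids calling a Python function once per group, at the cost of some extra care around missing values and floating-point cancellation when the means are large relative to the spread.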
