Feng Hu
Reputation: 63

Compute monthly covariance in xarray

I have wind speed data in the form of two xarray.DataArray objects:

u_250

Dims:
time: 600  latitude: 20  longitude: 40

Coordinates:
time      (time)      datetime64[ns]  1970-01-01 ... 2019-12-01
longitude (longitude) float64         101.0 103.0 105.0 ... 177.0 179.0
latitude  (latitude)  float64         1.0 3.0 5.0 7.0 ... 35.0 37.0 39.0

u_850

Dims:
time: 600  latitude: 20  longitude: 40

Coordinates:
time      (time)      datetime64[ns]  1970-01-01 ... 2019-12-01
longitude (longitude) float64         101.0 103.0 105.0 ... 177.0 179.0
latitude  (latitude)  float64         1.0 3.0 5.0 7.0 ... 35.0 37.0 39.0

u_250 and u_850 are the monthly mean wind speeds for every month from January 1970 to December 2019 (50 years, 600 time steps).

Now, I want to compute the covariance at each grid point for each month. For example, for the point (latitude = 1.0, longitude = 101.0) in January, there are 50 data points for u_250 and u_850 respectively, and I want to calculate the covariance of these two variables.

I know that the monthly means and variances can be calculated with u_250.groupby('time.month').mean() and u_250.groupby('time.month').var(), but how can I compute the monthly covariance between two variables?

The expected result should look like this:

cov_u250_u850

month: 12  latitude: 20  longitude: 40

longitude  (longitude)  float64  101.0 103.0 105.0 ... 177.0 179.0
latitude   (latitude)   float64  1.0 3.0 5.0 7.0 ... 35.0 37.0 39.0
month      (month)      int64    1 2 3 4 5 6 7 8 9 10 11 12

Upvotes: 1

Views: 280

Answers (1)

Michael Delgado
Reputation: 15432

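You could bundle your arrays into a dataset, group it by month with xr.Dataset.groupby, and then use .apply to call xr.cov on each group (in recent xarray versions .apply is kept as a backward-compatible alias of .map):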

In [3]: ds = xr.Dataset({"u_250": u_250, "u_850": u_850})
   ...: ds
Out[3]:
<xarray.Dataset>
Dimensions:    (time: 49, latitude: 20, longitude: 40)
Coordinates:
  * time       (time) datetime64[ns] 1970-12-31 1971-12-31 ... 2018-12-31
  * latitude   (latitude) int64 1 3 5 7 9 11 13 15 ... 25 27 29 31 33 35 37 39
  * longitude  (longitude) int64 101 103 105 107 109 111 ... 171 173 175 177 179
Data variables:
    u_250      (time, latitude, longitude) float64 0.8598 0.1653 ... 0.8918
    u_850      (time, latitude, longitude) float64 0.8598 0.1653 ... 0.8918

In [4]: ds.groupby("time.month").apply(lambda x: xr.cov(x.u_250, x.u_850, dim="time"))
Out[4]:
<xarray.DataArray (month: 1, latitude: 20, longitude: 40)>
array([[[0.09614746, 0.06594801, 0.07543877, 0.0778078 , 0.08283739,
         0.10066671, 0.09872022, 0.07863419, 0.07910119, 0.07795797,
         ...
         0.07400784, 0.08243132, 0.08108255, 0.08245004, 0.07669093,
         0.07766512, 0.07228637, 0.09536881, 0.09932027, 0.076379  ],
        [0.07673605, 0.09930399, 0.07511875, 0.08196243, 0.07823345,
...
         0.05127114, 0.09518304, 0.07020337, 0.07952586, 0.08139218],
        [0.08093504, 0.08978857, 0.07013655, 0.07011703, 0.08179119,
         0.07653206, 0.0754078 , 0.08776793, 0.08901924, 0.07567376,
         ...
         0.08344793, 0.0575026 , 0.07411288, 0.08003397, 0.08374315,
         0.08206529, 0.09083054, 0.09397327, 0.06969347, 0.07091056]]])
Coordinates:
  * latitude   (latitude) int64 1 3 5 7 9 11 13 15 ... 25 27 29 31 33 35 37 39
  * longitude  (longitude) int64 101 103 105 107 109 111 ... 171 173 175 177 179
  * month      (month) int64 12

See the GroupBy: Group and Bin Data section of the user guide for more info and the GroupBy section of the xarray tutorial for additional examples.
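
For the full 600-month arrays described in the question, the same approach can be sketched end to end as follows. This is only an illustration: the random values stand in for the real u_250 and u_850 data, the helper name monthly_cov is made up for this example, and .map is used in place of .apply.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-ins for the question's arrays (random values, NOT real wind data).
time = pd.date_range("1970-01-01", periods=600, freq="MS")  # 1970-01 ... 2019-12
lat = np.arange(1.0, 40.0, 2.0)     # 20 latitudes: 1.0 ... 39.0
lon = np.arange(101.0, 180.0, 2.0)  # 40 longitudes: 101.0 ... 179.0
dims = ("time", "latitude", "longitude")
coords = {"time": time, "latitude": lat, "longitude": lon}

rng = np.random.default_rng(0)
u_250 = xr.DataArray(rng.random((600, 20, 40)), coords=coords, dims=dims, name="u_250")
u_850 = xr.DataArray(rng.random((600, 20, 40)), coords=coords, dims=dims, name="u_850")

# Bundle both variables so each monthly group carries them together.
ds = xr.Dataset({"u_250": u_250, "u_850": u_850})


def monthly_cov(group):
    """Covariance over time within one calendar-month group (hypothetical helper)."""
    return xr.cov(group.u_250, group.u_850, dim="time")


# One covariance map per calendar month; the result has a "month" dimension.
cov_u250_u850 = ds.groupby("time.month").map(monthly_cov)
print(cov_u250_u850.sizes)
```

Note that xr.cov defaults to ddof=1 (sample covariance), while .var() on a DataArray or groupby defaults to ddof=0, so pass ddof explicitly if the two need to be consistent.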

Upvotes: 1
