Geoff D

Reputation: 385

Adding and using additional coordinates with xarray

I'm learning how to use the Python xarray package, but I'm having trouble with multi-dimensional data. Specifically, how do I add and use additional coordinates?

Here's an example.

import xarray as xr
import pandas as pd
import numpy as np

site_id = ['brw','sum','mlo']
dss = []
for site in site_id:
    df = pd.DataFrame(np.random.randn(20,2),columns=['a','b'],index=pd.date_range('20160101',periods=20,freq='MS'))
    ds = df.to_xarray()
    dss.append(ds)

ds = xr.concat(dss, dim=pd.Index(site_id, name='site'))
ds.coords['latitude'] = [71.323, 72.58, 19.5362]
ds.coords['longitude'] = [156.6114, 38.48, 155.5763]

My xarray data set looks like:

>>> ds
<xarray.Dataset>
Dimensions:    (index: 20, latitude: 3, longitude: 3, site: 3)
Coordinates:
  * index      (index) datetime64[ns] 2016-01-01 2016-02-01 2016-03-01 ...
  * site       (site) object 'brw' 'sum' 'mlo'
  * latitude   (latitude) float64 71.32 72.58 19.54
  * longitude  (longitude) float64 156.6 38.48 155.6
Data variables:
    a          (site, index) float64 -0.1403 -0.2225 -1.199 -0.8916 0.1149 ...
    b          (site, index) float64 -1.506 0.9106 -0.7359 2.123 -0.1987 ...

I can select a series by using the sel method based on a site code. For example:

>>> ds.sel(site='mlo')

But how do I select data based on the other coordinates (i.e. latitude or longitude)?

>>> ds.sel(latitude>50)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'latitude' is not defined

Upvotes: 3

Views: 2791

Answers (2)

Philipe Riskalla Leal

Reputation: 1066

Another way to select data with the sel method is to use Python's slice object.

To select data from an xarray object whose latitude is greater than (or equal to) a given value (e.g. 50 degrees north), you could write:

   ds.sel(dict(latitude=slice(50,None)))
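
The slice approach relies on label-based slicing, which generally expects the coordinate to be sorted; in the question's dataset latitude is not. A minimal sketch, assuming a small stand-in DataArray (the name `da` and its values are made up for illustration), that sorts with sortby before slicing:

```python
import xarray as xr

# Hypothetical stand-in: one value per latitude, in unsorted order
da = xr.DataArray(
    [10, 20, 30],
    dims="latitude",
    coords={"latitude": [71.323, 72.58, 19.5362]},
)

# Sort by the coordinate so that label-based slicing is well defined
da_sorted = da.sortby("latitude")

# Keep everything from 50 degrees upward (the lower bound is inclusive)
north = da_sorted.sel(latitude=slice(50, None))
print(north.values)
```

The same pattern should carry over to a Dataset, since sortby and sel are available on both object types.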

I hope it helps.

Upvotes: 1

Maximilian

Reputation: 8450

Thanks for the easy-to-reproduce example!

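Python keyword arguments only accept =, so .sel(x=y) is the only form available. In .sel(latitude>50), Python evaluates latitude>50 first as an ordinary expression, finds no variable named latitude, and raises the NameError. Here is an example using .isel with latitude (.sel is harder because the coordinate is a float type):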

In [7]: ds.isel(latitude=0)
Out[7]:
<xarray.Dataset>
Dimensions:    (index: 20, longitude: 3, site: 3)
Coordinates:
  * index      (index) datetime64[ns] 2016-01-01 2016-02-01 2016-03-01 ...
  * site       (site) object 'brw' 'sum' 'mlo'
    latitude   float64 71.32
  * longitude  (longitude) float64 156.6 38.48 155.6
Data variables:
    a          (site, index) float64 0.6493 -0.9105 -0.9963 -0.6206 0.6856 ...
    b          (site, index) float64 -0.03405 -1.49 0.2646 -0.3073 0.6326 ...
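
If a label-based lookup on the float coordinate is still wanted, exact matching is fragile because the displayed value (71.32) is rounded from the stored one (71.323). A minimal sketch, assuming a small stand-in DataArray with a sorted latitude coordinate (the name `da` and its values are illustrative, not the dataset above), using the method="nearest" option of sel:

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in with a sorted float coordinate
da = xr.DataArray(
    np.arange(3),
    dims="latitude",
    coords={"latitude": [19.5362, 71.323, 72.58]},
)

# Pick the entry whose latitude label is closest to 71.3
nearest = da.sel(latitude=71.3, method="nearest")
print(float(nearest.latitude), nearest.item())
```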

To use conditions such as >, you can use .where:

In [9]: ds.where(ds.latitude>50, drop=True)
Out[9]:
<xarray.Dataset>
Dimensions:    (index: 20, latitude: 2, longitude: 3, site: 3)
Coordinates:
  * index      (index) datetime64[ns] 2016-01-01 2016-02-01 2016-03-01 ...
  * site       (site) object 'brw' 'sum' 'mlo'
  * latitude   (latitude) float64 71.32 72.58
  * longitude  (longitude) float64 156.6 38.48 155.6
Data variables:
    a          (site, index, latitude) float64 0.6493 0.6493 -0.9105 -0.9105 ...
    b          (site, index, latitude) float64 -0.03405 -0.03405 -1.49 -1.49 ...
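
In the output above, a and b gain a latitude dimension because latitude was added as a dimension of its own. If the intent is one latitude per site, one option (an assumption about what the question is after, not something stated in it) is to attach the values along the existing site dimension, so that .where filters sites directly. A minimal sketch that reuses the question's setup:

```python
import numpy as np
import pandas as pd
import xarray as xr

site_id = ["brw", "sum", "mlo"]
dss = []
for site in site_id:
    df = pd.DataFrame(
        np.random.randn(20, 2),
        columns=["a", "b"],
        index=pd.date_range("20160101", periods=20, freq="MS"),
    )
    dss.append(df.to_xarray())

ds = xr.concat(dss, dim=pd.Index(site_id, name="site"))

# Attach latitude along the existing 'site' dimension
# instead of letting it become a new dimension
ds.coords["latitude"] = ("site", [71.323, 72.58, 19.5362])

# Keep only the sites north of 50 degrees
north = ds.where(ds.latitude > 50, drop=True)
print(north.site.values)
```

With this layout latitude is a non-dimension coordinate, so it cannot be passed to .sel directly, but conditions on it work through .where as shown.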

Upvotes: 4
