Reputation: 3515
I have two DataArray objects, called "A" and "B". Besides Latitude and Longitude, both of them have a time dimension denoting daily data. A has fewer time coordinates than B.
A's time dimension:
<xarray.DataArray 'time' (time: 1422)>
array(['2015-03-30T00:00:00.000000000', '2015-06-14T00:00:00.000000000',
'2015-06-16T00:00:00.000000000', ..., '2019-08-31T00:00:00.000000000',
'2019-09-01T00:00:00.000000000', '2019-09-02T00:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 2015-03-30 2015-06-14 ... 2019-09-02
B's time dimension:
<xarray.DataArray 'time' (time: 16802)>
array(['1972-01-01T00:00:00.000000000', '1972-01-02T00:00:00.000000000',
'1972-01-03T00:00:00.000000000', ..., '2017-12-29T00:00:00.000000000',
'2017-12-30T00:00:00.000000000', '2017-12-31T00:00:00.000000000'],
dtype='datetime64[ns]')
Coordinates:
* time (time) datetime64[ns] 1972-01-01 1972-01-02 ... 2017-12-31
Obviously, the A's time dimension is a subset of B's time dimension. I would like to select data from B using all the time labels from A. As the time in A is not continuous, I don't think slice is suitable, so I tried using sel.
B_sel = B.sel(time=A.time)
I received an error: KeyError: "not all values found in index 'time'"
Upvotes: 2
Views: 4989
Reputation: 41
A_new = A.where(A.time.isin(B.time), drop=True)
http://xarray.pydata.org/en/stable/user-guide/indexing.html
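Note that the one-liner above filters A rather than B, so a second step is still needed to pull the matching data out of B. A minimal sketch of the full round trip, assuming small made-up arrays in place of the question's data (the dates and values below are hypothetical and only there to make the example runnable):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-ins for the question's arrays (time dimension only).
b_time = pd.date_range("2017-12-01", "2017-12-31", freq="D")
a_time = pd.to_datetime(["2017-12-05", "2017-12-20", "2018-01-03"])  # last date is not in B

A = xr.DataArray(np.arange(a_time.size, dtype=float), coords={"time": a_time}, dims="time")
B = xr.DataArray(np.arange(b_time.size, dtype=float), coords={"time": b_time}, dims="time")

# Keep only the time steps of A that also exist in B.
A_new = A.where(A.time.isin(B.time), drop=True)

# Every remaining label is present in B's index, so sel() no longer raises.
B_sel = B.sel(time=A_new.time)
```

Here A_new keeps the two December 2017 dates and drops the 2018 one, and B_sel contains B's values at exactly those two dates.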
Upvotes: 4
Reputation: 1800
Obviously, the A's time dimension is a subset of B's time dimension.
I received an error: KeyError: "not all values found in index 'time'"
The error message itself suggests that the assumption made in the first statement is wrong. Also, if you look at your time values carefully, A has values until 2019 whereas B ends in 2017.
So, there are 2 ways to solve this:
If you're sure that every date in A up to the end of 2017 is also present in B, then
sel_dates = A.time.values[(A.time.dt.year <= 2017).values]
B_sel = B.sel(time=sel_dates)
If you're not sure that every date in A is also present in B (for example, because of some unexpected values somewhere), you can perform an element-wise check using np.isin(), which is one of numpy's speed-optimised functions
sel_dates = A.time.values[np.isin(A.time.values, B.time.values)]
B_sel = B.sel(time=sel_dates)
## example ##
import numpy as np
## dates1 is an array of the daily dates in one month (February 2005)
dates1 = np.arange('2005-02', '2005-03', dtype='datetime64[D]')
dates2 = np.array(['2005-02-03', '2002-02-05', '2000-01-05'], dtype='datetime64')
# check which entries of dates2 are also in dates1
print(np.isin(dates2, dates1))
# prints: [ True False False]
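To tie this back to the question, here is a small sketch that first reproduces the KeyError and then applies the np.isin() mask before calling sel. The arrays are hypothetical miniature versions of A and B (a one-week B and a three-date A with one date outside B), not the asker's real data:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical miniature versions of the question's arrays.
b_time = pd.date_range("2017-12-25", "2017-12-31", freq="D")
a_time = pd.to_datetime(["2017-12-26", "2017-12-30", "2019-09-02"])  # 2019 date is not in B
A = xr.DataArray(np.zeros(3), coords={"time": a_time}, dims="time")
B = xr.DataArray(np.arange(7.0), coords={"time": b_time}, dims="time")

# Selecting with all of A's labels fails because one label is missing from B.
try:
    B.sel(time=A.time)
    raised = False
except KeyError:
    raised = True

# Keep only the dates of A that are present in B, then select.
mask = np.isin(A.time.values, B.time.values)
sel_dates = A.time.values[mask]
B_sel = B.sel(time=sel_dates)
```

The mask is True for the two dates in December 2017 and False for the 2019 date, so B_sel has two time steps and the selection succeeds.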
Upvotes: 0