Reputation: 294508
Consider the time series s and its index tidx:
tidx = pd.date_range('2010-12-31', periods=3, freq='M')
s = pd.Series([0, 31, 59], tidx)
If I wanted to use s as a lookup series and passed the date '2011-02-23', I'd want to get the most recently available value. In this case that would be 31.
I've tried:
s.resample('D').ffill().loc['2011-02-23']
31
This does the job, but I had to resample the whole series just to get a single value. What is a more appropriate way to do this?
Upvotes: 3
Views: 156
Reputation: 221654
You could use searchsorted:
s[s.index.searchsorted('2011-02-23','right')-1]
It's fun to beat your own answer! Calling searchsorted on the underlying NumPy array instead of the index gives a further performance boost:
s[s.index.values.searchsorted(np.datetime64('2011-02-23'),'right')-1]
Runtime test -
In [235]: tidx = pd.date_range('2010-12-31', periods=300, freq='M')
...: s = pd.Series(range(300), tidx)
...:
In [236]: s[s.index.searchsorted('2035-03-23','right')-1]
Out[236]: 290
In [237]: s[s.index.values.searchsorted(np.datetime64('2035-03-23'),'right')-1]
Out[237]: 290
In [238]: %timeit s[s.index.searchsorted('2035-03-23','right')-1]
10000 loops, best of 3: 63 µs per loop
In [239]: %timeit s[s.index.values.searchsorted(np.datetime64('2035-03-23'),'right')-1]
10000 loops, best of 3: 46.7 µs per loop
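One edge case is worth guarding against: if the lookup date precedes the first index entry, searchsorted returns 0, and the resulting position -1 picks the last element instead of failing. The following is a minimal sketch of a guarded version, not part of the answer above. The helper name last_known is hypothetical, the index is built from explicit month-end dates to match the question's data, and .iloc is used to make the positional lookup explicit.

```python
import pandas as pd

# Explicit month-end dates matching the series in the question.
tidx = pd.to_datetime(['2010-12-31', '2011-01-31', '2011-02-28'])
s = pd.Series([0, 31, 59], tidx)


def last_known(series, date):
    """Hypothetical helper: value at the latest index entry <= date.

    Assumes the index is sorted in ascending order.
    """
    # side='right' gives the insertion point after any exact match,
    # so subtracting 1 lands on the latest timestamp <= date.
    pos = series.index.searchsorted(pd.Timestamp(date), side='right') - 1
    if pos < 0:
        # Without this check, position -1 would return the last value.
        raise KeyError(f'{date} precedes the first index entry')
    return series.iloc[pos]


result = last_known(s, '2011-02-23')
print(result)
```

Raising KeyError for an out-of-range date is one possible design choice; returning NaN would work just as well if missing values are easier to handle downstream.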
Upvotes: 6
Reputation: 294508
I used s.index.get_loc(), which can find the location of the "closest" index value:
s.iloc[s.index.get_loc('2011-02-23', 'ffill')]
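As far as I know, recent pandas releases (2.0 and later) no longer accept a method argument in get_loc, so the line above may raise a TypeError there. The sketch below shows what I believe is the closest equivalent, Index.get_indexer with method='ffill'. The names target, pos and value are illustrative, and the index is built from explicit dates to match the question's data.

```python
import pandas as pd

# Explicit month-end dates matching the series in the question.
tidx = pd.to_datetime(['2010-12-31', '2011-01-31', '2011-02-28'])
s = pd.Series([0, 31, 59], tidx)

target = pd.Timestamp('2011-02-23')

# get_indexer takes a list-like of targets and returns an array of positions.
# method='ffill' picks the previous index entry when there is no exact match.
pos = s.index.get_indexer([target], method='ffill')[0]

# A position of -1 means no index entry lies at or before the target.
value = s.iloc[pos] if pos != -1 else None
print(value)
```

Because get_indexer accepts several targets at once, the same call can look up a whole batch of dates in one go.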
Upvotes: 2
Reputation: 210932
What about this?
In [150]: s[s.index <= '2011-02-23'].tail(1)
Out[150]:
2011-01-31 31
Freq: M, dtype: int64
PS: this works only if the index is sorted.
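To illustrate the sorting caveat, here is a small sketch with a deliberately shuffled index. The data and variable names are made up for the example. Calling sort_index() first restores the expected result.

```python
import pandas as pd

# Same values as in the question, but stored out of chronological order.
s_unsorted = pd.Series(
    [31, 0, 59],
    pd.to_datetime(['2011-01-31', '2010-12-31', '2011-02-28']),
)
cutoff = pd.Timestamp('2011-02-23')

# tail(1) takes the last row in storage order, which is not the latest date.
naive = s_unsorted[s_unsorted.index <= cutoff].tail(1)

# Sorting by index first makes the last matching row the most recent one.
s_sorted = s_unsorted.sort_index()
latest = s_sorted[s_sorted.index <= cutoff].tail(1)

print(naive)
print(latest)
```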
Upvotes: 2