Reputation: 8659
Let's say I have a time series:
import pandas as pd
from numpy.random import randn
dates = pd.date_range('12/31/2014', periods=10)
df = pd.DataFrame(randn(10, 4), index=dates, columns=['A', 'B', 'C', 'D'])
Given a date such as d = '1/5/2015', how would I select the rows two days after d (1/6/2015 and 1/7/2015) and two days before d (1/4/2015 and 1/3/2015)? Is there a way to do this that skips dates missing from the data because of weekends or holidays?
Upvotes: 2
Views: 781
Reputation: 40973
You can do it like this:
from pandas.tseries.offsets import BDay
d = pd.Timestamp('1/5/2015')
two_bdays_before = d - BDay(2) # business days
two_bdays_later = d + BDay(2)
Then, to access all days between two_bdays_before and two_bdays_later:
>>> df[two_bdays_before:two_bdays_later]
A B C D
2015-01-01 0.741045 -0.051576 0.228247 -0.429165
2015-01-02 -0.312247 -0.391012 -0.256515 -0.849694
2015-01-03 -0.581522 -1.472528 0.431249 0.673033
2015-01-04 -1.408855 0.564948 1.019376 2.986657
2015-01-05 -0.566606 -0.316533 1.201412 -1.390179
2015-01-06 -0.052672 0.293277 -0.566395 -1.591686
2015-01-07 -1.669806 1.699540 0.082697 -1.229178
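Note that BDay only skips weekends, so 2015-01-01 still counts as a business day above. If holidays matter too, CustomBusinessDay accepts a holidays list. The following is a minimal sketch, assuming New Year's Day 2015 is the only holiday of interest; the holidays list and the cbd helper are illustrative assumptions, not part of the original answer:

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

dates = pd.date_range('12/31/2014', periods=10)
df = pd.DataFrame(np.random.randn(10, 4), index=dates, columns=['A', 'B', 'C', 'D'])

# Assumption: a hand-written holiday list; substitute your own calendar.
holidays = ['2015-01-01']

def cbd(n):
    """Hypothetical helper: an offset of n business days that skips the holidays above."""
    return CustomBusinessDay(n=n, holidays=holidays)

d = pd.Timestamp('1/5/2015')

# The two business days before d and the two after d, excluding d itself.
neighbors = pd.DatetimeIndex([d - cbd(2), d - cbd(1), d + cbd(1), d + cbd(2)])

# isin() avoids a KeyError if one of the computed dates is absent from the index.
selected = df[df.index.isin(neighbors)]
print(selected)
```

Here the two days before d should come out as 2014-12-31 and 2015-01-02, since the weekend and the assumed holiday are both skipped.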
Upvotes: 2
Reputation: 879361
df.index.get_loc(d) returns an integer index corresponding to the date represented by the date string d.
You can then use that integer index to select 2 rows before or after d in df:
import pandas as pd
import numpy as np
dates = pd.date_range('12/31/2014', periods=10)
df = pd.DataFrame(np.random.randn(10, 4), index=dates, columns=['A', 'B', 'C', 'D'])
d = '1/5/2015'
idx = df.index.get_loc(d)
print(df.iloc[idx+1:idx+3])
# A B C D
# 2015-01-06 1.211569 1.766432 0.153963 1.101142
# 2015-01-07 0.018377 0.112825 0.347711 -1.400145
print(df.iloc[idx-2:idx])
# A B C D
# 2015-01-03 -0.507956 -1.389623 -0.092228 -0.104655
# 2015-01-04 0.206824 1.226987 0.253424 -0.529778
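The positional approach also covers the weekend/holiday part of the question when the index simply has no rows for those dates, because iloc counts rows rather than calendar days. Here is a minimal sketch, assuming the frame is built with pd.bdate_range so that weekends are absent (that setup is an assumption for illustration, not the data from the question):

```python
import numpy as np
import pandas as pd

# Assumption: the data only contains business days, so weekends are missing rows.
dates = pd.bdate_range('12/31/2014', periods=10)
df = pd.DataFrame(np.random.randn(10, 4), index=dates, columns=['A', 'B', 'C', 'D'])

d = pd.Timestamp('1/5/2015')
idx = df.index.get_loc(d)  # integer position of d; raises KeyError if d is not in the index

# The two rows after d.
after = df.iloc[idx + 1:idx + 3]

# The two rows before d; clamp at 0 so a negative start does not wrap around
# to the end of the frame when d is near the top.
before = df.iloc[max(idx - 2, 0):idx]

print(before)
print(after)
```

With this index, the rows before d are 2015-01-01 and 2015-01-02, because the weekend of 1/3 and 1/4 has no rows to count.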
Upvotes: 1