Reputation: 7782
I have a set of financial data that only has entries on weekdays. For each row, I want to find the entry on the next calendar date. So any Monday would find Tuesday and any Tuesday would get Wednesday, but Friday would be blank because the next day is Saturday, when the market is not open. The same applies to holidays.
If I didn't care about weekends, a simple shift would do (Friday would get Monday's data), but since I do care about weekends, I need to conditionally shift the entries by exactly one calendar day.
Here is an example of the data I have; notice that some weeks are short.
date day_of_week day_of_month day_of_year month_of_year
5/1/2017 0 1 121 5
5/2/2017 1 2 122 5
5/3/2017 2 3 123 5
5/4/2017 3 4 124 5
5/8/2017 0 8 128 5
5/9/2017 1 9 129 5
5/10/2017 2 10 130 5
5/11/2017 3 11 131 5
5/12/2017 4 12 132 5
5/15/2017 0 15 135 5
5/16/2017 1 16 136 5
5/17/2017 2 17 137 5
5/18/2017 3 18 138 5
5/19/2017 4 19 139 5
5/23/2017 1 23 143 5
5/24/2017 2 24 144 5
5/25/2017 3 25 145 5
5/26/2017 4 26 146 5
Ideally, this is what I want. Notice that some next_calendarday values are blank because the next day falls on a weekend or holiday. In theory, Friday would never have a value, since the market is never open on Saturday.
date day_of_week day_of_month day_of_year month_of_year next_calendarday
5/1/2017 0 1 121 5 5/2/2017
5/2/2017 1 2 122 5 5/3/2017
5/3/2017 2 3 123 5 5/4/2017
5/4/2017 3 4 124 5
5/8/2017 0 8 128 5 5/9/2017
5/9/2017 1 9 129 5 5/10/2017
5/10/2017 2 10 130 5 5/11/2017
5/11/2017 3 11 131 5 5/12/2017
5/12/2017 4 12 132 5
5/15/2017 0 15 135 5 5/16/2017
5/16/2017 1 16 136 5 5/17/2017
5/17/2017 2 17 137 5 5/18/2017
5/18/2017 3 18 138 5 5/19/2017
5/19/2017 4 19 139 5
5/23/2017 1 23 143 5 5/24/2017
5/24/2017 2 24 144 5 5/25/2017
5/25/2017 3 25 145 5 5/26/2017
5/26/2017 4 26 146 5
Is there any way to conditionally shift based on the index value?
Upvotes: 1
Views: 514
Reputation: 627
You could achieve it with pd.Series.apply():
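import pandas as pd

# Dates the market is closed, as 'YYYY-MM-DD' strings (extend as needed)
holidays = ['2017-05-05']
# Monday is 0, so 5 and 6 are Saturday and Sunday
weekend_days = [5, 6]

def find_next_calendardate(date):
    # Return the date if it is a trading day, otherwise None
    date_str = str(date.date())
    if date_str not in holidays and date.weekday() not in weekend_days:
        return date
    return None

# Add one calendar day to each date, then blank out non-trading days
df['next_calendardate'] = (
    pd.to_datetime(df.date)
    .add(pd.Timedelta(1, unit='d'))
    .apply(find_next_calendardate)
)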
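If you would rather not maintain a holiday list by hand, one possible variation is to treat the data itself as the trading calendar: add one day and keep the result only when that date actually appears in the date column. This is just a sketch, not part of the answer above. The small frame below is a subset of the dates from the question, and it assumes every trading day is present in the data.

```python
import pandas as pd

# Subset of the trading dates from the question (5/5 and 5/22 are absent).
df = pd.DataFrame({"date": pd.to_datetime([
    "2017-05-03", "2017-05-04", "2017-05-08", "2017-05-09",
    "2017-05-19", "2017-05-23",
])})

# Candidate next calendar day for every row.
candidate = df["date"] + pd.Timedelta(days=1)

# Keep the candidate only if that date exists in the data; otherwise NaT.
df["next_calendardate"] = candidate.where(candidate.isin(df["date"]))
print(df)
```

Since this only checks membership, it should not depend on the rows being sorted.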
Upvotes: 0
Reputation: 862761
It seems you need to shift the values and set NaT
where the difference in dayofweek is negative:
df['date'] = pd.to_datetime(df['date'])
s = df['date'].shift(-1)
df['next_calendarday'] = s.mask(s.dt.dayofweek.diff().lt(0))
print (df)
date day_of_week day_of_month day_of_year month_of_year \
0 2017-05-01 0 1 121 5
1 2017-05-02 1 2 122 5
2 2017-05-03 2 3 123 5
3 2017-05-04 3 4 124 5
4 2017-05-08 0 8 128 5
5 2017-05-09 1 9 129 5
6 2017-05-10 2 10 130 5
7 2017-05-11 3 11 131 5
8 2017-05-12 4 12 132 5
9 2017-05-15 0 15 135 5
10 2017-05-16 1 16 136 5
11 2017-05-17 2 17 137 5
12 2017-05-18 3 18 138 5
13 2017-05-19 4 19 139 5
14 2017-05-23 1 23 143 5
15 2017-05-24 2 24 144 5
16 2017-05-25 3 25 145 5
17 2017-05-26 4 26 146 5
next_calendarday
0 2017-05-02
1 2017-05-03
2 2017-05-04
3 NaT
4 2017-05-09
5 2017-05-10
6 2017-05-11
7 2017-05-12
8 NaT
9 2017-05-16
10 2017-05-17
11 2017-05-18
12 2017-05-19
13 NaT
14 2017-05-24
15 2017-05-25
16 2017-05-26
17 NaT
EDIT:
Solution that adds 1 day instead:
df['date'] = pd.to_datetime(df['date'])
s = df['date'].add(pd.Timedelta(1, unit='d'))
df['next_calendarday'] = s.where(s.dt.dayofweek.diff(-1).eq(-1))
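Note that both snippets compare dayofweek values, so a holiday in the middle of a week (say a missing Tuesday between Monday and Wednesday) may slip through. A possible alternative, sketched here under the assumption that the rows are sorted by date, is to compare the shifted date with the current date directly and keep it only when the gap is exactly one day. The dates below are hypothetical and chosen to include a mid-week gap.

```python
import pandas as pd

# Hypothetical dates: a Friday, then a week where Tuesday 7/4 is missing.
df = pd.DataFrame({"date": pd.to_datetime([
    "2017-06-30", "2017-07-03", "2017-07-05", "2017-07-06",
])})

# Date of the following row (NaT for the last row).
nxt = df["date"].shift(-1)

# True only where the following row is exactly one calendar day later.
is_next_day = (nxt - df["date"]) == pd.Timedelta(days=1)

# Keep the shifted date where the gap is one day; everything else becomes NaT.
df["next_calendarday"] = nxt.where(is_next_day)
print(df)
```

Here only the 7/5 row gets a value (7/6); the Friday and the Monday before the gap stay NaT.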
Upvotes: 1