Bill Software Engineer

Reputation: 7782

In pandas, how to shift a date index to the next calendar date

I have a set of financial data that only has entries for weekdays. For each row, I want to find the entry for the next calendar date. So any Monday would find Tuesday and Tuesday would get Wednesday, BUT Friday would be blank because the next day is Saturday, when the market is not open. The same applies to holidays.

If I didn't care about weekends, a simple shift would do (Friday would get Monday's data), but since I do care about them, I need to conditionally shift the entries by exactly one calendar day.

Here is an example of the data I have. Notice that some weeks are short.

date    day_of_week day_of_month    day_of_year month_of_year
5/1/2017    0   1   121 5
5/2/2017    1   2   122 5
5/3/2017    2   3   123 5
5/4/2017    3   4   124 5
5/8/2017    0   8   128 5
5/9/2017    1   9   129 5
5/10/2017   2   10  130 5
5/11/2017   3   11  131 5
5/12/2017   4   12  132 5
5/15/2017   0   15  135 5
5/16/2017   1   16  136 5
5/17/2017   2   17  137 5
5/18/2017   3   18  138 5
5/19/2017   4   19  139 5
5/23/2017   1   23  143 5
5/24/2017   2   24  144 5
5/25/2017   3   25  145 5
5/26/2017   4   26  146 5

Ideally, what I want is this. Notice that some next_calendarday values are blank because the next day is a weekend or a holiday. In theory, Friday would never have a value, since the market is never open on Saturday.

date    day_of_week day_of_month    day_of_year month_of_year next_calendarday
5/1/2017    0   1   121 5   5/2/2017
5/2/2017    1   2   122 5   5/3/2017
5/3/2017    2   3   123 5   5/4/2017
5/4/2017    3   4   124 5
5/8/2017    0   8   128 5   5/9/2017
5/9/2017    1   9   129 5   5/10/2017 
5/10/2017   2   10  130 5   5/11/2017 
5/11/2017   3   11  131 5   5/12/2017
5/12/2017   4   12  132 5
5/15/2017   0   15  135 5   5/16/2017
5/16/2017   1   16  136 5   5/17/2017
5/17/2017   2   17  137 5   5/18/2017
5/18/2017   3   18  138 5   5/19/2017
5/19/2017   4   19  139 5
5/23/2017   1   23  143 5   5/24/2017
5/24/2017   2   24  144 5   5/25/2017
5/25/2017   3   25  145 5   5/26/2017
5/26/2017   4   26  146 5

Is there any way to conditionally shift based on the index value?

Upvotes: 1

Views: 514

Answers (2)

valentin

Reputation: 627

You could achieve this with pd.Series.apply():

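import pandas as pd

# Market holidays as ISO date strings (2017-05-05 is absent from the sample data)
holidays = ['2017-05-05']
# Monday is 0, so 5 and 6 are Saturday and Sunday
weekend_days = [5, 6]

def find_next_calendardate(date):
    # Keep the candidate date only if the market is open that day
    date_str = str(date.date())
    if date_str not in holidays and date.weekday() not in weekend_days:
        return date

    return None

# Add one calendar day to every date, then blank out weekends and holidays
df['next_calendardate'] = (
    pd.to_datetime(df.date)
    .add(pd.Timedelta(1, unit='d'))
    .apply(find_next_calendardate)
)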
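
If maintaining a holiday list is inconvenient, one alternative is to keep the next calendar date only when that date actually appears in the data. The following is a minimal sketch, not the answer's method: it assumes the `date` column holds every trading day of interest, and the four-row frame is a shortened, illustrative slice of the question's sample.

```python
import pandas as pd

# Shortened slice of the question's data: 5/5 (holiday) and the weekend are absent
df = pd.DataFrame({"date": ["5/3/2017", "5/4/2017", "5/8/2017", "5/9/2017"]})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y")

# Candidate next calendar date for every row
candidate = df["date"] + pd.Timedelta(days=1)

# Keep the candidate only if it is itself a date present in the data
df["next_calendardate"] = candidate.where(candidate.isin(df["date"]))

print(df)
```

Because the check is against the data itself, weekends and holidays are handled the same way, and the rows do not need to be sorted.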

Upvotes: 0

jezrael

Reputation: 862761

It seems you need to shift the values and set NaT where the difference in dayofweek is negative:

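df['date'] = pd.to_datetime(df['date'])

# Date of the following row
s = df['date'].shift(-1)
# Blank it out where the weekday number drops, i.e. a new week has started
df['next_calendarday'] = s.mask(s.dt.dayofweek.diff().lt(0))

print(df)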
         date  day_of_week  day_of_month  day_of_year  month_of_year  \
0  2017-05-01            0             1          121              5   
1  2017-05-02            1             2          122              5   
2  2017-05-03            2             3          123              5   
3  2017-05-04            3             4          124              5   
4  2017-05-08            0             8          128              5   
5  2017-05-09            1             9          129              5   
6  2017-05-10            2            10          130              5   
7  2017-05-11            3            11          131              5   
8  2017-05-12            4            12          132              5   
9  2017-05-15            0            15          135              5   
10 2017-05-16            1            16          136              5   
11 2017-05-17            2            17          137              5   
12 2017-05-18            3            18          138              5   
13 2017-05-19            4            19          139              5   
14 2017-05-23            1            23          143              5   
15 2017-05-24            2            24          144              5   
16 2017-05-25            3            25          145              5   
17 2017-05-26            4            26          146              5   

   next_calendarday  
0        2017-05-02  
1        2017-05-03  
2        2017-05-04  
3               NaT  
4        2017-05-09  
5        2017-05-10  
6        2017-05-11  
7        2017-05-12  
8               NaT  
9        2017-05-16  
10       2017-05-17  
11       2017-05-18  
12       2017-05-19  
13              NaT  
14       2017-05-24  
15       2017-05-25  
16       2017-05-26  
17              NaT  
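
The weekday comparison above works for the sample data, but it only detects a gap when the weekday number wraps around. If a mid-week day were missing (say a Wednesday holiday), Tuesday would still receive Thursday's date. A more direct check, sketched below under the assumption that the rows are sorted by date with no duplicates, is to keep the shifted date only when it is exactly one day later. The five dates are hypothetical and chosen to include a mid-week gap.

```python
import pandas as pd

# Hypothetical dates: 2017-05-03 (a Wednesday) is missing on purpose
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2017-05-01", "2017-05-02", "2017-05-04", "2017-05-05", "2017-05-08"]
    )
})

# Date of the following row (NaT for the last row)
nxt = df["date"].shift(-1)

# Keep it only when the gap to the following row is exactly one calendar day
one_day = pd.Timedelta(days=1)
df["next_calendarday"] = nxt.where((nxt - df["date"]) == one_day)

print(df)
```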

EDIT:

Solution that adds 1 day instead:

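df['date'] = pd.to_datetime(df['date'])

# Candidate: one calendar day after each date
s = df['date'].add(pd.Timedelta(1, unit='d'))
# Keep it only where the following row's candidate is exactly one weekday later
df['next_calendarday'] = s.where(s.dt.dayofweek.diff(-1).eq(-1))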
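
Since the question asks for the next day's entry rather than just its date, it may also help to look values up by date. The sketch below assumes the dates have been moved into a unique DatetimeIndex; the `close` column and its values are hypothetical and stand in for whatever data each row carries.

```python
import pandas as pd

# Hypothetical price column, indexed by trading date
df = pd.DataFrame(
    {"close": [10.0, 11.0, 12.0, 13.0]},
    index=pd.to_datetime(["2017-05-03", "2017-05-04", "2017-05-08", "2017-05-09"]),
)

# Index labels shifted forward by one calendar day
next_day = df.index + pd.Timedelta(days=1)

# Look up each shifted label; dates with no row (weekends, holidays) become NaN
df["next_close"] = df["close"].reindex(next_day).to_numpy()

print(df)
```

The `to_numpy()` call drops the shifted labels so the looked-up values line up with the original rows by position.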

Upvotes: 1
