Gonzalo
Gonzalo

Reputation: 1114

new date column for pandas df with conditions

Could you please elucidate me on what I am doing wrong here. I want to create a new date column where the new date will be same as column "inc_cr_date" if the "inc_cr_date_day" is not "Sunday" or "Saturday". When is Saturday or Sunday I want the new column to be "inc_cr_date" +1 and +2 days respectively.

df is:

inc_cr_date              inc_cr_date_day
2017-10-26 21:59:28.075  Thursday
2017-10-27 08:17:02.718  Friday
2017-10-22 14:44:07.695  Sunday

code (updated):

df['inc_cr_date_day'] = df['inc_cr_date'].dt.weekday_name

df['inc_cr_date_adjusted'] = np.select([(df['inc_cr_date_day'] == 'Saturday'),#condition working
                              (df['inc_cr_date_day'] == 'Sunday'),#condition working
                              ((df['inc_cr_date_day'] == 'Monday')& (df['inc_cr_date'].dt.hour > 18 ) & df['inc_cr_date'].dt.minute > 30),
                              ((df['inc_cr_date_day'] == 'Monday')& (df['inc_cr_date'].dt.hour < 9 ) & df['inc_cr_date'].dt.minute < 30),
                              ((df['inc_cr_date_day'] == 'Tuesday')& (df['inc_cr_date'].dt.hour > 18 ) & df['inc_cr_date'].dt.minute > 30),
                              ((df['inc_cr_date_day'] == 'Tuesday')& (df['inc_cr_date'].dt.hour < 9 ) & df['inc_cr_date'].dt.minute < 30),
                              ((df['inc_cr_date_day'] == 'Wednesday')& (df['inc_cr_date'].dt.hour > 18 ) & df['inc_cr_date'].dt.minute > 30),
                              ((df['inc_cr_date_day'] == 'Wednesday')& (df['inc_cr_date'].dt.hour < 9 ) & df['inc_cr_date'].dt.minute < 30),
                              ((df['inc_cr_date_day'] == 'Thursday')& (df['inc_cr_date'].dt.hour > 18 ) & df['inc_cr_date'].dt.minute > 30),
                              ((df['inc_cr_date_day'] == 'Thursday')& (df['inc_cr_date'].dt.hour < 9 ) & df['inc_cr_date'].dt.minute < 30),
                              ((df['inc_cr_date_day'] == 'Friday')& (df['inc_cr_date'].dt.hour > 18 ) & df['inc_cr_date'].dt.minute > 30),
                              ((df['inc_cr_date_day'] == 'Friday')& (df['inc_cr_date'].dt.hour < 9 ) & df['inc_cr_date'].dt.minute < 30)],


                           [(df['inc_cr_date']+pd.Timedelta('2 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('0 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),                            
                            (df['inc_cr_date']+pd.Timedelta('0 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('0 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('0 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('3 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                            (df['inc_cr_date']+pd.Timedelta('0 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes')],

                           df['inc_cr_date'])

Thanks a lot for your inputs.

Upvotes: 1

Views: 68

Answers (1)

Bharath M Shetty
Bharath M Shetty

Reputation: 30605

What you are doing the questions posted is adding timedelta to a list ['inc_cr_date'] + pd.DateOffset(1). Try it with row['inc_cr_date'] + pd.DateOffset(1)

So instead of if, elseif, else we can use the power of np.select, and + pd.Timedelta i.e

df['inc_cr_date2'] = np.select([(df['inc_cr_date_day'] == 'Saturday'),
                                (df['inc_cr_date_day'] == 'Sunday')],

                               [df['inc_cr_date']+pd.Timedelta('1 days'),
                                df['inc_cr_date']+pd.Timedelta('2 days')],

                                df['inc_cr_date']) 
             inc_cr_date inc_cr_date_day            inc_cr_date2
0 2017-10-26 21:59:28.075        Thursday 2017-10-26 21:59:28.075
1 2017-10-27 08:17:02.718          Friday 2017-10-27 08:17:02.718
2 2017-10-22 14:44:07.695          Sunday 2017-10-24 14:44:07.695

For specific time we can use normalize and add timedelta of hours you want ie.

df['inc_cr_date2'] = np.select([(df['inc_cr_date_day'] == 'Saturday'),
                              (df['inc_cr_date_day'] == 'Sunday')],

                           [(df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours'),
                            (df['inc_cr_date']+pd.Timedelta('2 days')).dt.normalize() + pd.Timedelta('9 Hours')],

                           df['inc_cr_date']) 
           inc_cr_date inc_cr_date_day            inc_cr_date2
0 2017-10-26 21:59:28.075        Thursday 2017-10-26 21:59:28.075
1 2017-10-27 08:17:02.718          Friday 2017-10-27 08:17:02.718
2 2017-10-22 14:44:07.695          Sunday 2017-10-24 09:00:00.000

Upvotes: 3

Related Questions