Gonzalo

Reputation: 1114

Comparing timedelta with dt.time in python

I have a pandas Series, df['inc_cr_date'], holding timestamps in the format 2017-10-27 08:00:26.808.

I would like to check whether the time of day in this column is later than 18:30. The problem arises when I use the code below:

# All the conditions can be reduced to one mask and result
days_one = ['Monday','Tuesday','Wednesday','Thursday']
days_two = days_one + ['Friday']

# Returns a boolean mask 
m1 = df['inc_cr_date_day'].isin(days_one) & (df['inc_cr_date'].dt.hour > 18 ) & (df['inc_cr_date'].dt.minute > 30)
m2 = df['inc_cr_date_day'].isin(days_two) & (df['inc_cr_date'].dt.hour < 9 ) & (df['inc_cr_date'].dt.minute < 30)

# Repeated result can be stored in one variable 
r1 = (df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes')
r2 = (df['inc_cr_date']+pd.Timedelta('0 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes')


df['inc_cr_date_adjusted'] = np.select([
                          m1, m2,      
                          (df['inc_cr_date_day'] == 'Saturday'),
                          (df['inc_cr_date_day'] == 'Sunday'),
                          ((df['inc_cr_date_day'] == 'Friday') & (df['inc_cr_date'].dt.hour > 18) & (df['inc_cr_date'].dt.minute > 30)),
                          ],
                          [r1, r2,
                          (df['inc_cr_date']+pd.Timedelta('2 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),
                          (df['inc_cr_date']+pd.Timedelta('1 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes'),                           
                          (df['inc_cr_date']+pd.Timedelta('3 days')).dt.normalize() + pd.Timedelta('9 Hours 30 Minutes')
                          ],
                          df['inc_cr_date'])

This gives wrong results, because the hour condition can be true while the minute condition is false (for example, 19:05 is later than 18:30 but fails minute > 30). How can I compare the ['inc_cr_date'] column against 18:30 as a whole, and NOT against hour and minute separately?

I have also tried pandas.DataFrame.between_time, as below:

start = datetime.time(18,30,0)
end = datetime.time(23,59,0)

df['inc_cr_date'].between_time(start, end)

But I get the error:

TypeError: Index must be DatetimeIndex

I get the same error even if I first do:

df['inc_cr_date'] = pd.DatetimeIndex(df['inc_cr_date'])

The DataFrame looks like this:

  inc_cr_date                  inc_cr_date_day       
0 2017-10-26 21:59:28.075      Thursday 2017-10-27   
1 2017-10-21 16:49:58.722      Saturday 2017-10-23   
2 2017-10-11 09:30:05.258      Wednesday 2017-10-11   

The output should look like this:

  inc_cr_date                  inc_cr_date_day        inc_cr_date_adjusted
0 2017-10-26 21:20:28.075      Thursday 2017-10-27    09:30:00.000
1 2017-10-21 16:49:58.722      Saturday 2017-10-23    09:30:00.000
2 2017-10-11 09:30:05.258      Wednesday 2017-10-11   09:30:05.258

Please advise. Thanks.

Upvotes: 1

Views: 1001

Answers (1)

ShreyasG

Reputation: 806

How about this?

df['timeFlag'] = df['inc_cr_date'].apply(lambda x: 1 if x.time() > datetime.time(18, 30, 0) else 0)
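If the column is large, the same kind of comparison can probably be done without apply, by subtracting midnight from each timestamp and comparing the remainder with a Timedelta. This is only a rough sketch, assuming inc_cr_date is already a datetime64 column; the sample rows and the names since_midnight, after_1830 and before_0930 are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, loosely based on the rows in the question
df = pd.DataFrame({
    "inc_cr_date": pd.to_datetime([
        "2017-10-26 21:59:28.075",
        "2017-10-21 16:49:58.722",
        "2017-10-11 09:30:05.258",
        "2017-10-12 19:05:00.000",  # hour > 18 is True, minute > 30 is False
        "2017-10-13 08:45:00.000",  # hour < 9 is True, minute < 30 is False
    ])
})

# Time elapsed since midnight of the same day, as a timedelta Series
since_midnight = df["inc_cr_date"] - df["inc_cr_date"].dt.normalize()

# Compare the whole time of day at once instead of hour and minute separately
after_1830 = since_midnight > pd.Timedelta(hours=18, minutes=30)
before_0930 = since_midnight < pd.Timedelta(hours=9, minutes=30)

print(after_1830.tolist())
print(before_0930.tolist())
```

The last two sample rows are the cases where checking hour and minute separately gives the wrong answer, so they are worth keeping in any quick test.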

In your example, you can change these lines:

# Returns a boolean mask 
m1 = df['inc_cr_date_day'].isin(days_one) & (df['inc_cr_date'].dt.hour > 18 ) & (df['inc_cr_date'].dt.minute > 30)
m2 = df['inc_cr_date_day'].isin(days_two) & (df['inc_cr_date'].dt.hour < 9 ) & (df['inc_cr_date'].dt.minute < 30)

to:

# Returns a boolean mask (the lambdas return True/False so that & combines them cleanly)
# m1: after 18:30 on Monday to Thursday; m2: before 09:30 on Monday to Friday
m1 = df['inc_cr_date_day'].isin(days_one) & df['inc_cr_date'].apply(lambda x: x.time() > datetime.time(18, 30, 0))
m2 = df['inc_cr_date_day'].isin(days_two) & df['inc_cr_date'].apply(lambda x: x.time() < datetime.time(9, 30, 0))
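On the between_time error in the question: as far as I can tell, between_time selects rows by the index, not by a column, so assigning pd.DatetimeIndex(...) back to the column does not change anything. Below is a rough sketch of two ways around it, assuming the column is already datetime64. The sample rows and the names late_rows, positions and mask are made up for illustration.

```python
import datetime
import pandas as pd

# Hypothetical sample data taken from the rows shown in the question
df = pd.DataFrame({
    "inc_cr_date": pd.to_datetime([
        "2017-10-26 21:59:28.075",
        "2017-10-21 16:49:58.722",
        "2017-10-11 09:30:05.258",
    ])
})

start = datetime.time(18, 30, 0)
end = datetime.time(23, 59, 0)

# Option 1: move the timestamps into the index, then filter with between_time
late_rows = df.set_index("inc_cr_date").between_time(start, end)

# Option 2: keep the original index and build a boolean mask from the
# integer positions returned by DatetimeIndex.indexer_between_time
positions = pd.DatetimeIndex(df["inc_cr_date"]).indexer_between_time(start, end)
mask = pd.Series(False, index=df.index)
mask.iloc[positions] = True

print(late_rows)
print(mask.tolist())
```

Option 2 gives a mask aligned with df, so it could be combined with the isin(...) conditions in the same way as m1 and m2.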

Upvotes: 1
