The Great

Reputation: 7723

Strip timestamp info based on criteria using pandas

I have a dataframe as shown below:

import pandas as pd

df = pd.DataFrame({'login_datetime':['5/7/2013 09:27:00 AM','09/08/2013 11:21:00 AM','06/06/2014 08:00:00 AM','06/06/2014 05:00:00 AM','12/11/2011 10:00:00 AM'],
                   'logout_datetime':['15/7/2013 09:27:00 AM','09/08/2013','06/06/2014 08:00:00','06/06/2014','12/11/2011']})
df.login_datetime = pd.to_datetime(df.login_datetime)
df.logout_datetime = pd.to_datetime(df.logout_datetime)

I would like to replace the 00:00:00 timestamp in logout_datetime with the corresponding login_datetime timestamp when the dates match. Please note that 00:00:00 is not always displayed in Jupyter Notebook.

For example, in the 2nd record the dates match but the logout timestamp is 00:00:00, so I would like to change it to 11:21:00 AM.

I tried the code below, but I am not sure how to strip the timestamp and paste it into the other column (I don't wish to copy the whole column):

 df['ts'] =  [datetime.datetime.time(d) for d in df['login_datetime']]
 df['new_logout_time'] = df['logout_datetime'] + df['ts']

I expect my output to look like this:

login_datetime         logout_datetime
2013-05-07 09:27:00    2013-07-15 09:27:00
2013-09-08 11:21:00    2013-09-08 11:21:00
2014-06-06 08:00:00    2014-06-06 08:00:00
2014-06-06 05:00:00    2014-06-06 05:00:00
2011-12-11 10:00:00    2011-12-11 10:00:00

Upvotes: 1

Views: 62

Answers (2)

SeaBean

Reputation: 23217

You can use .mask() to conditionally copy the datetime from login_datetime to logout_datetime, as follows:

df['logout_datetime'] = df['logout_datetime'].mask(df['logout_datetime'].dt.date == df['login_datetime'].dt.date, df['login_datetime'])

Result:

print(df)


       login_datetime     logout_datetime
0 2013-05-07 09:27:00 2013-07-15 09:27:00
1 2013-09-08 11:21:00 2013-09-08 11:21:00
2 2014-06-06 08:00:00 2014-06-06 08:00:00
3 2014-06-06 05:00:00 2014-06-06 05:00:00
4 2011-12-11 10:00:00 2011-12-11 10:00:00
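Note that the condition above only checks that the dates match, so a same-day logout that already has a real time would also be overwritten. A minimal sketch of a stricter variant, assuming the replacement should happen only when the logout time is exactly midnight (the sample rows, including the 17:30 logout, are illustrative and not from the question):

```python
import pandas as pd

# Illustrative data: row 1 has a midnight logout, row 2 a real same-day logout time
df = pd.DataFrame({
    "login_datetime": pd.to_datetime(
        ["2013-05-07 09:27:00", "2013-09-08 11:21:00", "2014-06-06 08:00:00"]
    ),
    "logout_datetime": pd.to_datetime(
        ["2013-07-15 09:27:00", "2013-09-08 00:00:00", "2014-06-06 17:30:00"]
    ),
})

# True where login and logout fall on the same calendar day
same_day = df["logout_datetime"].dt.normalize() == df["login_datetime"].dt.normalize()
# True where the logout time component is exactly 00:00:00
at_midnight = df["logout_datetime"] == df["logout_datetime"].dt.normalize()

# Replace only rows that satisfy both conditions
df["logout_datetime"] = df["logout_datetime"].mask(
    same_day & at_midnight, df["login_datetime"]
)
print(df)
```

Comparing against dt.normalize() tests the stored value, so it works even when the notebook display hides the 00:00:00 part.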

Edit

To extract the time of login_datetime, you can use:

df['ts'] = df['login_datetime'].dt.time
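The values returned by .dt.time are plain datetime.time objects, which cannot be added to a datetime column directly. One possible workaround, sketched here with illustrative data, is to take the time of day as a timedelta by subtracting the normalized (midnight) datetime:

```python
import pandas as pd

login = pd.Series(pd.to_datetime(["2013-09-08 11:21:00", "2014-06-06 05:00:00"]))
logout_date = pd.Series(pd.to_datetime(["2013-09-08", "2014-06-06"]))

# datetime.time objects: fine for display, but not addable to datetimes
as_time = login.dt.time

# Time of day as a timedelta: the datetime minus its own midnight
as_delta = login - login.dt.normalize()

# A timedelta can be added to a datetime column
combined = logout_date + as_delta
print(combined)
```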

Upvotes: 1

jezrael

Reputation: 862911

Convert the times to timedeltas so that they can be added to the datetimes conditionally:

import numpy as np

mask = df['logout_datetime'].dt.date == df['login_datetime'].dt.date
df['ts'] = pd.to_timedelta(df['login_datetime'].dt.time.astype(str))

df['new_logout_time'] = np.where(mask,
                                 df['logout_datetime'].dt.normalize() + df['ts'],
                                 df['logout_datetime'])
print(df)
       login_datetime     logout_datetime              ts     new_logout_time
0 2013-05-07 09:27:00 2013-07-15 09:27:00 0 days 09:27:00 2013-07-15 09:27:00
1 2013-09-08 11:21:00 2013-09-08 00:00:00 0 days 11:21:00 2013-09-08 11:21:00
2 2014-06-06 08:00:00 2014-06-06 08:00:00 0 days 08:00:00 2014-06-06 08:00:00
3 2014-06-06 05:00:00 2014-06-06 00:00:00 0 days 05:00:00 2014-06-06 05:00:00
4 2011-12-11 10:00:00 2011-12-11 00:00:00 0 days 10:00:00 2011-12-11 10:00:00

Or without new column:

mask = df['logout_datetime'].dt.date == df['login_datetime'].dt.date
ts = pd.to_timedelta(df['login_datetime'].dt.time.astype(str))

df['new_logout_time'] = np.where(mask,
                                 df['logout_datetime'].dt.normalize() + ts,
                                 df['logout_datetime'])
print(df)
       login_datetime     logout_datetime     new_logout_time
0 2013-05-07 09:27:00 2013-07-15 09:27:00 2013-07-15 09:27:00
1 2013-09-08 11:21:00 2013-09-08 00:00:00 2013-09-08 11:21:00
2 2014-06-06 08:00:00 2014-06-06 08:00:00 2014-06-06 08:00:00
3 2014-06-06 05:00:00 2014-06-06 00:00:00 2014-06-06 05:00:00
4 2011-12-11 10:00:00 2011-12-11 00:00:00 2011-12-11 10:00:00
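As a possible alternative to np.where, the same selection can be sketched with Series.where, which stays inside pandas and returns a Series with the original index. The two sample rows are illustrative; the mask and timedelta conversion follow the approach above:

```python
import pandas as pd

df = pd.DataFrame({
    "login_datetime": pd.to_datetime(["2013-05-07 09:27:00", "2013-09-08 11:21:00"]),
    "logout_datetime": pd.to_datetime(["2013-07-15 09:27:00", "2013-09-08 00:00:00"]),
})

mask = df["logout_datetime"].dt.date == df["login_datetime"].dt.date
ts = pd.to_timedelta(df["login_datetime"].dt.time.astype(str))

# Series.where keeps values where the condition is True,
# so pass ~mask to replace only the rows whose dates match
df["new_logout_time"] = df["logout_datetime"].where(
    ~mask, df["logout_datetime"].dt.normalize() + ts
)
print(df)
```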

Upvotes: 1
