The Great
The Great

Reputation: 7693

How to replace timestamp across the columns using pandas

df = pd.DataFrame({
 'subject_id':[1,1,2,2],
 'time_1':['2173/04/11 12:35:00','2173/04/12 12:50:00','2173/04/11 12:59:00','2173/04/12 13:14:00'],
 'time_2':['2173/04/12 16:35:00','2173/04/13 18:50:00','2173/04/13 22:59:00','2173/04/21 17:14:00'],
  'val' :[5,5,40,40],
  'iid' :[12,12,12,12]   
 })
df['time_1'] = pd.to_datetime(df['time_1'])
df['time_2'] = pd.to_datetime(df['time_2'])
df['day'] = df['time_1'].dt.day

Currently my dataframe looks like as shown below

enter image description here

I would like to replace the timestamp in time_1 column to 00:00:00 and time_2 column to 23:59:00

This is what I tried but it doesn't work

df.groupby(df['subject_id'])['time_1'].apply(lambda x: pd.datetime.strftime(x, "%H:%M:%S") == "00:00:00")   #approach 1

df.groupby(df['subject_id'])['time_1'].apply(lambda x: pd.pd.Timestamp(hour = '00', second = '00')) #approach 2

I expect my output dataframe to be like as shown below

enter image description here

Upvotes: 1

Views: 1083

Answers (1)

jezrael
jezrael

Reputation: 862406

I pandas if all datetimes have 00:00:00 times in same column then not display it.

Use Series.dt.floor or Series.str.normalize for remove times and for second add DateOffset:

df['time_1'] = pd.to_datetime(df['time_1']).dt.floor('d')
#alternative
#df['time_1'] = pd.to_datetime(df['time_1']).dt.normalize()
df['time_2']=pd.to_datetime(df['time_2']).dt.floor('d') + pd.DateOffset(hours=23, minutes=59)
df['day'] = df['time_1'].dt.day

print (df)
   subject_id     time_1              time_2  val  iid  day
0           1 2173-04-11 2173-04-12 23:59:00    5   12   11
1           1 2173-04-12 2173-04-13 23:59:00    5   12   12
2           2 2173-04-11 2173-04-13 23:59:00   40   12   11
3           2 2173-04-12 2173-04-21 23:59:00   40   12   12

Upvotes: 1

Related Questions