How to subtract every time from the start time

Question

I have a dataframe "user1" with around 100000 rows of which some of the entries are as follows :

    user                    timestamp        x          y          z       class   
0      0x11      2018-04-10 12:37:51.252     -0.86      6.51        1.28        walk
1      0x11      2018-04-10 12:37:51.252     -3.16      9.52        2.98        walk
2      0x11      2018-04-10 12:37:51.324     -3.31      8.94        2.24        walk

Instead of full time, I want the difference between current time and the start time i.e suppose in 3rd row the timestamp is 2018-04-10 12:37:51.252. I would like to change it to (2018-04-10 12:37:51.252 - 2018-04-10 12:37:51.324 = 0.72 secs). How to do this with pandas. All the other column entries should remain as it is.

Umar.H · Accepted Answer

ensuring your datetime column is a datetime is the first step,

import numpy as np 
import pandas as pd
df['timestamp'] = pd.to_datetime(df['timestamp'])

next, we grab the earliest timestamp record per user with transform

the benefit of transform is that it allows one to use groupby methods without modifying the index, very handy.

ts = df.groupby(['user'])['timestamp'].transform('min')

then return the delta as seconds to match your target output.

df['timedelta'] = (df['timestamp'] - ts) / np.timedelta64(1,'s')

print(df)

   user               timestamp     x     y     z class  timedelta
0  0x11 2018-04-10 12:37:51.252 -0.86  6.51  1.28  walk      0.000
1  0x11 2018-04-10 12:37:51.252 -3.16  9.52  2.98  walk      0.000
2  0x11 2018-04-10 12:37:51.324 -3.31  8.94  2.24  walk      0.072

How to subtract every time from the start time

Answers (2)

Related Questions