Reputation: 79
I have a DataFrame column with random timestamps and NaT values in between them:
timestamp
01-01-2018 13:12:48
NaT
NaT
NaT
04-01-2018 08:15:12
NaT
NaT
I want to create another column (col_A) that restarts counting from 0 whenever a new timestamp appears in the timestamp column. I wouldn't mind also having that count as a time value without the date (col_B). Is that possible?
timestamp            col_A  col_B
01-01-2018 13:12:48  0      00:00:00
NaT                  1      00:01:00
NaT                  2      00:02:00
NaT                  3      00:03:00
04-01-2018 08:15:12  0      00:00:00
NaT                  1      00:01:00
NaT                  2      00:02:00
Upvotes: 1
Views: 43
Reputation: 93151
It's a gaps-and-islands problem: every time timestamp is not null, a new island starts. You usually solve these problems with a cumulative sum of some kind, which gives every row the number of the island it belongs to.
Try this:
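import pandas as pd
# Each non-null timestamp starts a new island; cumsum gives every row its island number
islands = df['timestamp'].notnull().cumsum()
# Row position within each island, restarting at 0
df['col_A'] = df.groupby(islands).cumcount()
df['col_B'] = pd.to_timedelta(df['col_A'], unit='minute')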
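For anyone who wants to run this end to end, here is a minimal sketch on the sample data from the question. The way the frame is built is my assumption: the question doesn't say how the dates are parsed, so I read them as day-first (`%d-%m-%Y`), and I use `notna()` and `unit="min"` as equivalent spellings of the calls above.

```python
import pandas as pd

# Sample data from the question. The day-first format is an assumption;
# None entries become NaT after parsing.
raw = ["01-01-2018 13:12:48", None, None, None,
       "04-01-2018 08:15:12", None, None]
df = pd.DataFrame(
    {"timestamp": pd.to_datetime(raw, format="%d-%m-%Y %H:%M:%S")}
)

# Each non-null timestamp bumps the running total, so every row
# carries the number of the island it belongs to.
islands = df["timestamp"].notna().cumsum()

# Position of each row within its island, restarting at 0.
df["col_A"] = df.groupby(islands).cumcount()

# The same counter expressed as a Timedelta in minutes.
df["col_B"] = pd.to_timedelta(df["col_A"], unit="min")

print(df)
```

Note that col_B is a Timedelta column, not a time of day, so pandas prints it with a leading day count (for example `0 days 00:01:00`).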
Upvotes: 1