Reputation: 241
Here is a snippet of data-frame which looks like this (original data frame contains 8k rows):
User State change_datetime endstate
---------------------------------------------------------
100234 XIM 2016-01-19 17:03:12 Inactive
100234 Active 2016-01-28 17:17:15 XIM
100234 Active 2016-02-16 17:57:50 NaN
100234 Live 2018-03-04 06:30:01 Active
213421 XIM 2016-02-16 17:57:53 NaN
213421 Active 2018-02-01 10:01:51 XIM
213421 Active 2018-02-01 20:49:41 NaN
213421 Active 2018-02-13 20:40:11 NaN
213421 R 2018-03-04 05:38:51 Active
612312 B-98 2018-11-01 17:12:11 XIM
I'm trying to find out how long each unique User spends in an 'Active' state till they change into a different state other than 'Active'. Is it possible to create a loop to generate a list indicating unique User and the length of time the User spent as 'Active' before changing state?
To calculate the length of time, I found the difference between the first 'change_datetime' where the User is in an 'Active' state between the last 'change_datetime' value for when User was last seen as 'Active'.
For example: User '100234' spent: 19 days, 0 hours, 40 minutes and 35 seconds in 'Active' state
Thank you.
Upvotes: 0
Views: 172
Reputation: 22503
If you always have one session only, you only need two conditions:
df["change_datetime"] = pd.to_datetime(df["change_datetime"])
cond1 = df["State"].eq("Active")
cond2 = df["State"].shift(-1).ne("Active")
start = df.loc[cond1].groupby("User")["change_datetime"].first()
end = df.loc[cond1&cond2].groupby("User")["change_datetime"].first()
print (end-start)
User
100234 19 days 00:40:35
213421 12 days 10:38:20
Name: change_datetime, dtype: timedelta64[ns]
Upvotes: 2