Reputation: 7089
I have a problem that I think involves iterating over the individual rows within each group in pandas, although a vectorized solution would be even better. Basically, I'm trying to calculate the number of days a group has spent in a particular stage:
Group stage time
A 1 2000-01-01
A 1 2000-01-10
A 2 2000-01-25
A 2 2000-02-04
A 2 2000-02-20
B 1 2000-01-05
B 1 2000-01-13
C 3 2000-04-01
Would become:
Group stage time stage_duration
A 1 2000-01-01 0
A 1 2000-01-10 9
A 2 2000-01-25 0
A 2 2000-02-04 10
A 2 2000-02-20 26
B 1 2000-01-05 0
B 1 2000-01-13 8
C 3 2000-04-01 0
Edit 1: Thanks to Jeff and DSM, this worked perfectly:
df.groupby(["Group", "stage"])["time"].apply(lambda x: x-x.iloc[0])
Upvotes: 0
Views: 176
Reputation: 7089
Thanks to Jeff and DSM, this worked perfectly:
df.groupby(["Group", "stage"])["time"].apply(lambda x: x-x.iloc[0])
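For anyone who wants the result as whole days in a new column, here is a minimal sketch of a fully vectorized variant. It is not the one-liner above but a swapped-in alternative using `transform("first")`, which broadcasts each (Group, stage) start date back onto every row. The column name `stage_duration` and the variable `stage_start` are my own choices, and B's second date is taken as 2000-01-13 to match the expected output in the question.

```python
import pandas as pd

# Sample rows from the question (B's second date assumed to be 2000-01-13,
# matching the expected output).
df = pd.DataFrame({
    "Group": ["A", "A", "A", "A", "A", "B", "B", "C"],
    "stage": [1, 1, 2, 2, 2, 1, 1, 3],
    "time": pd.to_datetime([
        "2000-01-01", "2000-01-10", "2000-01-25", "2000-02-04",
        "2000-02-20", "2000-01-05", "2000-01-13", "2000-04-01",
    ]),
})

# First timestamp within each (Group, stage), repeated for every row of that group.
stage_start = df.groupby(["Group", "stage"])["time"].transform("first")

# Time elapsed since the stage started, converted from Timedelta to whole days.
df["stage_duration"] = (df["time"] - stage_start).dt.days

print(df)
```

This assumes the rows are already sorted by time within each group; if they might not be, `transform("min")` should be the safer choice.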
Upvotes: 1
Reputation: 4951
You can get the number of days between two dates by parsing them with datetime.strptime and subtracting one from the other.
For example:
>>> from datetime import datetime
>>> date1 = "2000-01-25"
>>> date2 = "2000-02-04"
>>> delta = (datetime.strptime(date2,"%Y-%m-%d") - datetime.strptime(date1,"%Y-%m-%d"))
>>> delta.days
10
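To apply the same subtraction to the grouped data in the question without pandas, one possible sketch is to remember the first date seen for each (group, stage) pair and subtract it from every later row. The function name `stage_durations` and the tuple layout of `rows` are hypothetical, and B's second date is taken as 2000-01-13 to match the expected output.

```python
from datetime import datetime

def stage_durations(rows):
    """Days since the first row of each (group, stage) pair.

    rows: iterable of (group, stage, "YYYY-MM-DD") tuples, assumed to be
    in chronological order within each group.
    """
    first_seen = {}
    durations = []
    for group, stage, date_str in rows:
        when = datetime.strptime(date_str, "%Y-%m-%d")
        # setdefault stores the date the first time a pair appears and
        # returns the stored start date on every later call.
        start = first_seen.setdefault((group, stage), when)
        durations.append((when - start).days)
    return durations

rows = [
    ("A", 1, "2000-01-01"), ("A", 1, "2000-01-10"),
    ("A", 2, "2000-01-25"), ("A", 2, "2000-02-04"), ("A", 2, "2000-02-20"),
    ("B", 1, "2000-01-05"), ("B", 1, "2000-01-13"),
    ("C", 3, "2000-04-01"),
]
print(stage_durations(rows))  # [0, 9, 0, 10, 26, 0, 8, 0]
```

This loops row by row, so for large frames a pandas groupby is likely to be faster.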
Upvotes: 0