Luke

Reputation: 7089

Iterating within groups in Pandas

I have a problem that I think involves iterating over individual rows within groups in pandas, though a vectorized solution would be great if anyone has one. Basically, I'm trying to calculate the number of days a group has been in a particular stage:

Group    stage    time
  A        1      2000-01-01
  A        1      2000-01-10
  A        2      2000-01-25
  A        2      2000-02-04
  A        2      2000-02-20
  B        1      2000-01-05
  B        1      2000-01-13
  C        3      2000-04-01

Would become:

Group    stage    time           stage duration
  A        1      2000-01-01           0
  A        1      2000-01-10           9
  A        2      2000-01-25           0
  A        2      2000-02-04           10
  A        2      2000-02-20           26
  B        1      2000-01-05           0
  B        1      2000-01-13           8
  C        3      2000-04-01           0

Edit 1: Thanks to Jeff and DSM, this worked perfectly:

df.groupby(["Group", "stage"])["time"].apply(lambda x: x-x.iloc[0])

Upvotes: 0

Views: 176

Answers (2)

Luke

Reputation: 7089

Thanks to Jeff and DSM, this worked perfectly:

df.groupby(["Group", "stage"])["time"].apply(lambda x: x-x.iloc[0])
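For anyone who wants whole-day integers rather than Timedelta values, here is a minimal sketch of a fully vectorized variant. It is not the exact line above: it swaps apply for transform("first"), and it assumes the time column is already a datetime dtype and that rows are sorted by time within each group (using "min" instead of "first" would drop that assumption). The sample frame is rebuilt from the expected-output table in the question.

```python
import pandas as pd

# Sample data taken from the expected-output table in the question.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "A", "A", "B", "B", "C"],
    "stage": [1, 1, 2, 2, 2, 1, 1, 3],
    "time": pd.to_datetime([
        "2000-01-01", "2000-01-10", "2000-01-25", "2000-02-04",
        "2000-02-20", "2000-01-05", "2000-01-13", "2000-04-01",
    ]),
})

# First timestamp of each (Group, stage) pair, broadcast back to every row.
# Assumes rows are already in chronological order within each group.
stage_start = df.groupby(["Group", "stage"])["time"].transform("first")

# Subtracting gives a Timedelta column; .dt.days converts it to whole days.
df["stage duration"] = (df["time"] - stage_start).dt.days

print(df)
```

Because transform returns a Series aligned with the original index, the result can be assigned straight back as a new column without any index juggling.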

Upvotes: 1

Elisha

Reputation: 4951

You can get the number of days between two dates by parsing each one with datetime.strptime and subtracting; the resulting timedelta has a days attribute.

For example:

>>> from datetime import datetime

>>> date1 = "2000-01-25"
>>> date2 = "2000-02-04"
>>> delta = (datetime.strptime(date2,"%Y-%m-%d") - datetime.strptime(date1,"%Y-%m-%d"))
>>> delta.days
10

Upvotes: 0
