How to find the difference between dates for each id in Python?

Question

I have a pandas dataframe with a format like this:

student_id     subject_id   subject_date  
100             2000        2010-01-01
100             2001        2010-03-05
100             2002        2012-05-25
101             2000        2009-01-10
101             2001        2016-08-16
102             2000        2008-05-05
102             2003        2008-05-20
102             2004        2009-01-03
102             2005        2010-02-13

The dataframe is already ordered by student_id and subject_date. The goal is to compute, within each student_id, the number of days between each subject_date and the previous one. Each student_id is guaranteed to have at least 2 distinct subject_id values. The resulting dataframe should look something like this:

student_id     subject_id   subject_date  diff_in_dates  
100             2000        2010-01-01    NA
100             2001        2010-03-05    30
100             2002        2012-05-25    60
101             2000        2009-01-10    NA
101             2001        2016-08-16    3000
102             2000        2008-05-05    NA
102             2003        2008-05-20    15
102             2004        2009-01-03    180
102             2005        2010-02-13    370

The diff_in_dates values above are only rough approximations, not the actual differences.

MaxU - stand with Ukraine · Accepted Answer

If I understand correctly (IIUC):

In [362]: df['diff_in_dates'] = df.groupby('student_id')['subject_date'].diff().dt.days

In [363]: df
Out[363]:
   student_id  subject_id subject_date  diff_in_dates
0         100        2000   2010-01-01             NaN
1         100        2001   2010-03-05            63.0
2         100        2002   2012-05-25           812.0
3         101        2000   2009-01-10             NaN
4         101        2001   2016-08-16          2775.0
5         102        2000   2008-05-05             NaN
6         102        2003   2008-05-20            15.0
7         102        2004   2009-01-03           228.0
8         102        2005   2010-02-13           406.0
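
For anyone who wants to reproduce the output above from scratch, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and assumes subject_date may have been loaded as strings, so it converts the column with pd.to_datetime first (the .dt accessor needs a datetime dtype). The sort step is only a safeguard, since the question says the data is already ordered.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "student_id": [100, 100, 100, 101, 101, 102, 102, 102, 102],
    "subject_id": [2000, 2001, 2002, 2000, 2001, 2000, 2003, 2004, 2005],
    "subject_date": [
        "2010-01-01", "2010-03-05", "2012-05-25",
        "2009-01-10", "2016-08-16",
        "2008-05-05", "2008-05-20", "2009-01-03", "2010-02-13",
    ],
})

# Assumption: the dates arrive as strings, so convert them to datetime64.
df["subject_date"] = pd.to_datetime(df["subject_date"])

# Safeguard: make sure rows are ordered by student and date before diffing.
df = df.sort_values(["student_id", "subject_date"]).reset_index(drop=True)

# Difference to the previous row within each student, expressed in whole days.
# The first row of every group has no predecessor, so it becomes NaN.
df["diff_in_dates"] = df.groupby("student_id")["subject_date"].diff().dt.days

print(df)
```

The column comes out as float because of the NaN in the first row of each group. If whole-number days are preferred, casting with .astype("Int64") (pandas' nullable integer dtype) should keep the missing values while dropping the ".0".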
