Pandas - Groupby + Shift not working as expected

Question

I have a df that I'm trying to perform a groupby and shift on. However, the output isn't what I want.

I want every row to carry the "next" DueDate for its group. So if the current DueDate is 1/1 and the next DueDate is 6/30, a new NextDueDate column should hold 6/30 for all rows where DueDate==1/1. Likewise, for all rows where DueDate==6/30, NextDueDate should hold whichever DueDate comes after 6/30.

Original df
ID Document Date  DueDate
1  ABC      1/31  1/1  
1  ABC      2/28  1/1  
1  ABC      3/31  1/1  
1  ABC      4/30  6/30 
1  ABC      5/31  6/30 
1  ABC      6/30  7/31 
1  ABC      7/31  7/31 
1  ABC      8/31  9/30

Desired output df
ID Document Date  DueDate NextDueDate
1  ABC      1/31  1/1     6/30
1  ABC      2/28  1/1     6/30
1  ABC      3/31  1/1     6/30
1  ABC      4/30  6/30    7/31
1  ABC      5/31  6/30    7/31
1  ABC      6/30  7/31    9/30
1  ABC      7/31  7/31    9/30
1  ABC      8/31  9/30    10/31

I've tried many variations along the lines of df['NextDueDate'] = df.groupby(['ID','Document'])['DueDate'].shift(-1), but that shifts row by row rather than by distinct DueDate, so it doesn't get me where I want.

cs95 · Accepted Answer

Define a function f to perform replacement based on shifted dates -

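def f(x):
    # Distinct due dates, in order of first appearance
    i = x.drop_duplicates()
    # The date that follows each one; the last has no successor, so fill in a placeholder
    j = i.shift(-1).fillna('10/30')

    # Replace each row's date with its successor
    return x.map(dict(zip(i, j)))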
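
To see what f does, it may help to run its three steps on a single group by hand. The sketch below uses a hypothetical Series named due that holds the DueDate values from the question; the names unique_dates, next_dates and lookup are only for illustration. It assumes a given due date does not reappear later in the same group, since drop_duplicates keeps only the first occurrence.

```python
import pandas as pd

# Hypothetical single-group input: the DueDate values for ID 1, Document ABC.
due = pd.Series(['1/1', '1/1', '1/1', '6/30', '6/30', '7/31', '7/31', '9/30'])

# Step 1: keep one row per distinct due date, in order of first appearance.
unique_dates = due.drop_duplicates()

# Step 2: shift up by one so each date lines up with its successor;
# the last date has none, so fill it with the placeholder used in the answer.
next_dates = unique_dates.shift(-1).fillna('10/30')

# Step 3: build a date -> next date lookup and apply it to every row.
lookup = dict(zip(unique_dates, next_dates))
result = due.map(lookup)

print(lookup)
print(result.tolist())
```

The lookup dictionary is what makes all rows sharing a DueDate receive the same NextDueDate, which a plain row-wise shift(-1) cannot do.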

Now, call this function inside a groupby + apply on ID and Document -

df['NextDueDate'] = df.groupby(['ID', 'Document']).DueDate.apply(f)
df

   ID Document  Date DueDate NextDueDate
0   1      ABC  1/31     1/1        6/30
1   1      ABC  2/28     1/1        6/30
2   1      ABC  3/31     1/1        6/30
3   1      ABC  4/30    6/30        7/31
4   1      ABC  5/31    6/30        7/31
5   1      ABC  6/30    7/31        9/30
6   1      ABC  7/31    7/31        9/30
7   1      ABC  8/31    9/30       10/30
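
For anyone who wants to reproduce this end to end, here is a self-contained sketch that rebuilds the question's frame and applies the same f. One deliberate difference from the call above: it uses transform instead of apply. Depending on the pandas version, apply on a grouped column may return a result indexed by the group keys, which would not align on assignment; transform always returns a result aligned to the original index. Treat that swap as an assumption about your environment, not as part of the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    'ID': [1] * 8,
    'Document': ['ABC'] * 8,
    'Date': ['1/31', '2/28', '3/31', '4/30', '5/31', '6/30', '7/31', '8/31'],
    'DueDate': ['1/1', '1/1', '1/1', '6/30', '6/30', '7/31', '7/31', '9/30'],
})


def f(x):
    # Distinct due dates in order of first appearance.
    i = x.drop_duplicates()
    # Successor of each date; '10/30' is the placeholder from the answer.
    j = i.shift(-1).fillna('10/30')
    # Map each row's date to its successor.
    return x.map(dict(zip(i, j)))


# transform (swapped in for apply) keeps the result aligned to df's index.
df['NextDueDate'] = df.groupby(['ID', 'Document'])['DueDate'].transform(f)
print(df)
```

Note that the fill value for the last due date ('10/30' here) is a placeholder chosen in the answer; replace it with whatever makes sense for your data, or leave the value missing by dropping the fillna call.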
