Pandas merge on subgroups of dataframe

Question

Consider the dataframe df:

   SCHOOL  ID  SET  NAME
0  Foo     1   0    Ben
1  Foo     2   0    Ben
2  Foo     3   1    Chris
3  Foo     4   1    Joe
4  Foo     5   0    Tom
5  Bar     1   0    Harry
6  Bar     2   1    Jeff
7  Bar     3   0    George
8  Bar     4   1    Tom

I want to create the new dataframe

   SCHOOL  ID  SET  NAME    PREV_NAME
0  Foo     1   0    Ben     None
1  Foo     2   0    Ben     Ben
2  Foo     3   1    Chris   None
3  Foo     4   1    Joe     Chris
4  Foo     5   0    Tom     Ben
5  Bar     1   0    Harry   None
6  Bar     2   1    Jeff    None
7  Bar     3   0    George  Harry
8  Bar     4   1    Tom     Jeff

where PREV_NAME is the previous NAME (sorted by ID) in the same SCHOOL and SET as a given record.

If there were no SET condition, then I could do something like

df_copy = df.copy()
df_copy['ID'] = df_copy['ID'] + 1
df.merge(df_copy, on=['SCHOOL', 'ID'], how='left')

As there is a SET condition, we need a way of grouping each unique (SCHOOL, SET) pair and then applying the above merge (noting that we can no longer use the ID) with the new index.

How would I achieve this using Pandas?

Pandas merge on subgroups of dataframe

Answers (1)

Related Questions