Ajax
Ajax

Reputation: 1512

calculate row difference groupwise in pandas

I need to calculate the difference between two rows groupwise using pandas.

| Group | Value | ID |
----------------------
|  M1   | 10    | F1 |
----------------------
|  M1   | 11    | F2 |
----------------------
|  M1   | 12    | F3 |
----------------------
|  M1   | 15    | F4 |
----------------------

Example output:

----------------------
|  M1   | F3 - F2 | 1 |
----------------------
|  M1   | F4 - F1 | 5 |

To calculate the sum I would use pandas.groupby('Group').sum(), but how do you calculate the difference between rows where the row ordering is important?

Upvotes: 1

Views: 212

Answers (1)

jezrael
jezrael

Reputation: 862581

I think you need custom function with apply which return DataFrame for each group, for select by position is used iat:

def f(x):
    #print (x)
    a = x['Value'].iat[2] - x['Value'].iat[1]
    b = x['Value'].iat[3] - x['Value'].iat[0]
    c = x['ID'].iat[2] + ' - ' + x['ID'].iat[1]
    d = x['ID'].iat[3] + ' - ' + x['ID'].iat[0]
    return pd.DataFrame({'Value': [a,b], 'ID':[c,d]})

df = df.groupby('Group').apply(f).reset_index(level=1, drop=True).reset_index()
print (df)

  Group       ID  Value
0    M1  F3 - F2      1
1    M1  F4 - F1      5

Upvotes: 1

Related Questions