Reputation: 1512
I need to calculate the difference between two rows groupwise using pandas.
| Group | Value | ID |
----------------------
| M1 | 10 | F1 |
----------------------
| M1 | 11 | F2 |
----------------------
| M1 | 12 | F3 |
----------------------
| M1 | 15 | F4 |
----------------------
Example output:
----------------------
| M1 | F3 - F2 | 1 |
----------------------
| M1 | F4 - F1 | 5 |
To calculate the sum I would use pandas.groupby('Group').sum(), but how do you calculate the difference between rows where the row ordering is important?
Upvotes: 1
Views: 212
Reputation: 862581
I think you need custom function with apply which return DataFrame
for each group, for select by position is used iat
:
def f(x):
#print (x)
a = x['Value'].iat[2] - x['Value'].iat[1]
b = x['Value'].iat[3] - x['Value'].iat[0]
c = x['ID'].iat[2] + ' - ' + x['ID'].iat[1]
d = x['ID'].iat[3] + ' - ' + x['ID'].iat[0]
return pd.DataFrame({'Value': [a,b], 'ID':[c,d]})
df = df.groupby('Group').apply(f).reset_index(level=1, drop=True).reset_index()
print (df)
Group ID Value
0 M1 F3 - F2 1
1 M1 F4 - F1 5
Upvotes: 1