Comparing groups of rows in Pandas Dataframe that share a column value

Question

I have a Dataframe containing a name column and and a numerical column.

0 name number
1 a    3
2 a    6
3 b    2
4 b    500
5 c    349
6 c    200

What I want to do is define a new column that shows the numerical difference between two sequential rows that have the same name. This can be outputted onto the first row of the comparison I.e.:

0 name number diff
1  a    3     3
2  a    6     N/A
3  b    2     498
4  b    500   N/A
5  c    349   149
6  c    200   N/A

What is the best way of doing this using Pandas? As a bonus, if there is more than two rows with the same name, I want the highest numbered row compared against the lowest value row only.

jezrael · Accepted Answer

Use groupby with diff and abs:

df['diff'] = df.groupby('name')['number'].diff().abs()
print (df)
   0 name  number   diff
0  1    a       3    NaN
1  2    a       6    3.0
2  3    b       2    NaN
3  4    b     500  498.0
4  5    c     349    NaN
5  6    c     200  149.0

Or:

df['diff'] = df.groupby('name')['number'].diff(-1).abs()
print (df)
   0 name  number   diff
0  1    a       3    3.0
1  2    a       6    NaN
2  3    b       2  498.0
3  4    b     500    NaN
4  5    c     349  149.0
5  6    c     200    NaN

For your another question need transform:

print (df)
   0 name  number
0  1    a       3
1  2    a       6
2  2    a       8
3  3    b       2
4  4    b     500
5  4    b     600
6  5    c     349
7  6    c     200

df['diff'] = df.groupby('name')['number'].transform(lambda x: x.max() - x.min())
print (df)
   0 name  number  diff
0  1    a       3     5
1  2    a       6     5
2  2    a       8     5
3  3    b       2   598
4  4    b     500   598
5  4    b     600   598
6  5    c     349   149
7  6    c     200   149

Or:

df = df.groupby('name')['number'].apply(lambda x: x.max() - x.min()).reset_index()
print (df)
  name  number
0    a       5
1    b     598
2    c     149

Comparing groups of rows in Pandas Dataframe that share a column value

Answers (1)

Related Questions