comparing the last values in pandas groupby

Question

This is my dataframe:

df = pd.DataFrame({'a': list('xxxxxzzz'), 'b':[0,0,1,0,1,0,1,1], 'c': [100, 101, 105, 110, 120, 125, 100, 150], 'd':[0,0,0,1,1,0,0,0]})

I group them:

groups = df.groupby(['a', 'd'])

I want to add another column to df that in each group shows the difference (in percentage) between the last value of c that its b is 0 and the last value that its b is 1.

For example in the first group I want to compare c of row 2 and row 1.

My desired groups looks like this:

('x', 0)
   a  b    c  d   result
0  x  0  100  0     3.96
1  x  0  101  0     3.96
2  x  1  105  0     3.96
('x', 1)
   a  b    c  d   result
3  x  0  110  1     9.09
4  x  1  120  1     9.09
('z', 0)
   a  b    c  d   result
5  z  0  125  0     20.0
6  z  1  100  0     20.0
7  z  1  150  0     20.0

rafaelc · Accepted Answer

Define a custom function and use GroupBy.apply

def func(s):
    l0 = s[s.b==0].tail(1).c.item()
    l1 = s[s.b==1].tail(1).c.item()
    s['result'] = (l1 - l0)/l0 * 100
    return s

df.groupby(['a','d']).apply(func)

Outputs

    a   b   c   d   result
0   x   0   100 0   3.960396
1   x   0   101 0   3.960396
2   x   1   105 0   3.960396
3   x   0   110 1   9.090909
4   x   1   120 1   9.090909
5   z   0   125 0   20.000000
6   z   1   100 0   20.000000
7   z   1   150 0   20.000000

If you need each groups separately, just use a list comprehension [func(g) for n, g in df.groupby(['a','d'])]

comparing the last values in pandas groupby

Answers (2)

Related Questions