LeoBloom

Reputation: 11

Pandas: Groupby Multiple Columns, Return Last Value

Similar questions have been asked, but I cannot find my exact case (ideally solved without a loop). I have:

df  
    A  B  C 
    1 30 101
    1 31 220
    1 32 310
    2 30 400
    2 31 555
    2 32 616
    3 30 777
    3 31 703
    3 32 844

I want to create a column 'D' that, for each group of 'A', holds the value of 'C' from the row with the last 'B':

A  B  C  D
1 30 101 310
1 31 220 310
1 32 310 310
2 30 400 616
2 31 555 616
2 32 616 616
3 30 777 844
3 31 703 844
3 32 844 844

I tried

df['D'] = df.groupby(['A', 'B']).agg({'C': ['last']})

but get

TypeError: incompatible index of inserted column with frame index

Then I tried

df['D'] = df.groupby(['A', 'B']).agg({'C': ['last']}).reset_index(0,drop=True)

and get

ValueError: cannot reindex from a duplicate axis

Any help is appreciated.

Upvotes: 1

Views: 455

Answers (1)

Quang Hoang

Reputation: 150775

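Use transform instead of agg. agg returns one row per group, so its index no longer lines up with df, whereas transform returns a result aligned with the original rows. Sorting by 'B' first makes 'last' pick the row with the largest 'B' in each 'A' group: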

df['D'] = df.sort_values('B').groupby('A')['C'].transform('last')

Output:

   A   B    C    D
0  1  30  101  310
1  1  31  220  310
2  1  32  310  310
3  2  30  400  616
4  2  31  555  616
5  2  32  616  616
6  3  30  777  844
7  3  31  703  844
8  3  32  844  844
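
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the frame from the question and then applies the one-liner above. The second part is an alternative that is not from the original answer: it assumes 'last of B' means the largest 'B' per group and looks that row up with idxmax, which avoids the sort. The names last_rows and D_alt are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "B": [30, 31, 32, 30, 31, 32, 30, 31, 32],
    "C": [101, 220, 310, 400, 555, 616, 777, 703, 844],
})

# transform keeps the original row index, so the result can be assigned
# straight back even though the frame was sorted before grouping
df["D"] = df.sort_values("B").groupby("A")["C"].transform("last")

# Alternative sketch (assumption: "last" means the largest B in each group):
# idxmax returns, per group of A, the row label where B is largest
last_rows = df.groupby("A")["B"].idxmax()
# Build an A -> C lookup from those rows and map it onto every row
df["D_alt"] = df["A"].map(df.loc[last_rows].set_index("A")["C"])

print(df)
```

Both columns should agree on this data. The idxmax variant only differs from the sort-based one when 'B' has ties within a group, in which case idxmax takes the first occurrence of the maximum.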

Upvotes: 1
