Take max from multiple grouped data pandas

Question

I am in a loop, and each iteration gives me a groupby result like the one below, produced by df.groupby(['grp1','grp2'])['mycol'].sum()

Basically, I am getting the sum of mycol for each (grp1, grp2) group.

grp1  grp2 
A     1    10 
B     1    20
C     2    30 
D     3    40 
E     4    50 
      1    60

In the next iteration I may get a grouped result like this:

grp1  grp2 
A     1    20 
D     3    40 
E     4    30 
      1    90 
F     1    40

I want to keep the max across iterations. So after the 2nd iteration I would have output like this:

grp1  grp2 
A     1    20 #because 20 was higher than 10
B     1    20 #carried as it is
C     2    30 #carried as it is
D     3    40 #carried as it is (both were equal)
E     4    30 #because 90+30 >50+60
      1    90 
F     1    40 #added
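One way to read the expected output above: the comparison happens per grp1 (the E rows are replaced together because 90+30 > 50+60), and a grp1 seen for the first time is simply added. Below is a minimal sketch under that assumption. The helper names make_sums and update_peak are made up for illustration, and the input rows are chosen only to reproduce the two tables above.

```python
import pandas as pd

def make_sums(rows):
    """Build a Series like df.groupby(['grp1','grp2'])['mycol'].sum()."""
    df = pd.DataFrame(rows, columns=["grp1", "grp2", "mycol"])
    return df.groupby(["grp1", "grp2"])["mycol"].sum()

def update_peak(best, cur):
    """Keep, per grp1, the rows of whichever iteration has the larger grp1 total.

    Assumption: ties keep the older rows; a grp1 not seen before is added.
    """
    if best is None:
        return cur.copy()
    cur_tot = cur.groupby(level="grp1").sum()
    # Totals seen so far, aligned to the new keys (NaN where grp1 is new)
    old_tot = best.groupby(level="grp1").sum().reindex(cur_tot.index)
    wins = (cur_tot > old_tot) | old_tot.isna()
    winners = wins[wins].index
    keep_old = best[~best.index.get_level_values("grp1").isin(winners)]
    take_new = cur[cur.index.get_level_values("grp1").isin(winners)]
    return pd.concat([keep_old, take_new]).sort_index()

first = make_sums([("A", 1, 10), ("B", 1, 20), ("C", 2, 30),
                   ("D", 3, 40), ("E", 4, 50), ("E", 1, 60)])
second = make_sums([("A", 1, 20), ("D", 3, 40), ("E", 4, 30),
                    ("E", 1, 90), ("F", 1, 40)])

peak = None
for cur in (first, second):  # stands in for the real loop
    peak = update_peak(peak, cur)

print(peak)
```

Because the result is sorted by index, the (E, 1) row is printed before (E, 4); the values match the table above.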

So by the end I know which groups reached their peak values over, say, 5 iterations. It sounds straightforward (keep track of the max seen so far), but I can't work out how to approach it. I tried df.groupby(['grp1','grp2'])['mycol'].sum().to_dict() with the idea of updating the dict each time a new df is read (just a try; I'm not sure how to keep the dict updated). Or maybe there is a simple pandas or NumPy solution that I don't know about yet.
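If "max seen so far" is taken literally, per (grp1, grp2) pair instead of per grp1, no dict is needed: stacking the running result with the new one and taking a grouped max does the update. This is only a sketch of that simpler reading, with sample values copied from the two tables. Note that it keeps 50 for (E, 4), which differs from the 30 in the expected output.

```python
import pandas as pd

def make_sums(rows):
    """Build a Series like df.groupby(['grp1','grp2'])['mycol'].sum()."""
    df = pd.DataFrame(rows, columns=["grp1", "grp2", "mycol"])
    return df.groupby(["grp1", "grp2"])["mycol"].sum()

first = make_sums([("A", 1, 10), ("B", 1, 20), ("C", 2, 30),
                   ("D", 3, 40), ("E", 4, 50), ("E", 1, 60)])
second = make_sums([("A", 1, 20), ("D", 3, 40), ("E", 4, 30),
                    ("E", 1, 90), ("F", 1, 40)])

running = None
for cur in (first, second):  # stands in for the real loop
    if running is None:
        running = cur
    else:
        # Stack old and new rows, then keep the larger value per (grp1, grp2)
        running = (pd.concat([running, cur])
                     .groupby(level=["grp1", "grp2"])
                     .max())

print(running)
```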

