qwerty
qwerty

Reputation: 887

How to aggregate multiple columns - Pandas

I have this df:

ID         Date  XXX  123_Var  456_Var  789_Var  123_P  456_P  789_P
 A  07/16/2019     1      987      551      313     22     12     94
 A  07/16/2019     9      135      748      403     92     40     41
 A  07/18/2019     8      376      938      825     14     69     96
 A  07/18/2019     5      259      176      674     52     75     72
 B   07/16/2019    9      690      304      948     56     14     78
 B   07/16/2019    8      819      185      699     33     81     83
 B   07/18/2019    1      580      210      847     51     64     87

I want to group the df by ID and Date, aggregate the XXX column by the maximum value, and aggregate 123_Var, 456_Var, 789_Var columns by the minimum value.

* Note: The df contains many of these columns. The shape is: {some int}_Var.

This is the current code I've started to write:

df = (df.groupby(['ID','Date'], as_index=False)
        .agg({'XXX':'max', list(df.filter(regex='_Var')): 'min'}))

Expected result:

ID         Date  XXX  123_Var  456_Var  789_Var
 A  07/16/2019     9      135      551      313
 A  07/18/2019     8      259      176      674
 B   07/16/2019    9      690      185      699
 B   07/18/2019    1      580      210      847

Upvotes: 1

Views: 42

Answers (1)

jezrael
jezrael

Reputation: 862711

Create dictionary dynamic with dict.fromkeys and then merge it with {'XXX':'max'} dict and pass to GroupBy.agg:

d = dict.fromkeys(df.filter(regex='_Var').columns, 'min')
df = df.groupby(['ID','Date'], as_index=False).agg({**{'XXX':'max'}, **d})
print (df)
  ID        Date  XXX  123_Var  456_Var  789_Var
0  A  07/16/2019    9      135      551      313
1  A  07/18/2019    8      259      176      674
2  B  07/16/2019    9      690      185      699
3  B  07/18/2019    1      580      210      847

Upvotes: 2

Related Questions