Different column aggregations in python dataframe

Question

How to get a row at last of dataframe which is column aggregations. But, it is sum for few and max for other?

import pandas as pd

data = [
    ["id_1",6,7,9],
    ["id_2",9,7,1],
    ["id_3",6,7,10],
    ["id_4",9,5,10]
]
df = pd.DataFrame(data, columns = ['Student Id', 'Math', 'Physical', 'Flag'])

     Student Id  Math  Physical  Flag
0       id_1     6         7          9
1       id_2     9         7          1
2       id_3     6         7         10
3       id_4     9         5         10

I want to get a row at last as total which gives me column sum for Math & Physical and max for Flag. Excluding student_id column

     Student Id  Math  Physical  Chemistry
0       id_1     6         7          9
1       id_2     9         7          1
2       id_3     6         7         10
3       id_4     9         5         10
       (Total)   30        26         10

I can do the sum of all by below but how to get different aggregations for different columns and also exclude NAN?

df.loc['(TOTAL)'] = df[['Math', 'Physical' ]].sum(axis = 0)

jezrael · Accepted Answer

Use DataFrame.agg for specify aggregations, for new total row is used DataFrame.set_index:

df = df.set_index('Student Id')
df.loc['(TOTAL)'] = df.agg({'Math':'sum', 'Physical':'sum', 'Flag':'max'})
df = df.reset_index()
print (df)
  Student Id  Math  Physical  Flag
0       id_1     6         7     9
1       id_2     9         7     1
2       id_3     6         7    10
3       id_4     9         5    10
4    (TOTAL)    30        26    10

Or replaced NaN only for specified column:

df.loc[len(df)] = df.agg({'Math':'sum', 'Physical':'sum', 'Flag':'max'})
df = df.fillna({'Student Id':'(TOTAL)'})

print (df)
  Student Id  Math  Physical  Flag
0       id_1   6.0       7.0   9.0
1       id_2   9.0       7.0   1.0
2       id_3   6.0       7.0  10.0
3       id_4   9.0       5.0  10.0
4    (TOTAL)  30.0      26.0  10.0

Different column aggregations in python dataframe

Answers (1)

Related Questions