Henrique Branco

Reputation: 1940

Count to first column and sum to the rest of the columns pandas groupby

I have a pandas DataFrame df with 290 columns.

Is there a way to run a .groupby operation that follows these rules:

  1. sum for the 2nd column
  2. count for the 3rd column
  3. mean for all other columns

I know I could write it like this:

df.groupby("column1") \
    .agg({"column2":"sum", 
          "column3":"count",
          "column4":"mean",
          ...
          "column290":"mean"})

But this is impractical, since I would have to type out every one of the remaining columns.

Is there a way to do this without listing them all? For example, by setting a default function for every column I don't explicitly pass to agg?

Upvotes: 0

Views: 915

Answers (2)

Scott Boston

Reputation: 153460

Let's build the aggregation dictionary in code, so only the exceptions have to be typed by hand:

import pandas as pd
import numpy as np

df=pd.DataFrame(np.arange(100).reshape(10,-1), columns=[*'ABCDEFGHIJ'])

# Define the aggregations for the first three columns explicitly
aggdict={'A':'sum',
         'B':'sum',
         'C':'count'}

# Use a for loop to add the rest of the columns to the dictionary,
# which gives them a default aggregation method
for i in df.columns[3:]:
    aggdict[i]='mean'

# Use agg with dictionary
df.groupby(df.index%2).agg(aggdict)
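The same idea can be wrapped in a small helper applied to the column names from the question. This is only a sketch: `agg_with_default`, its parameters, and the toy data are hypothetical names introduced for illustration, not part of pandas.

```python
import pandas as pd


def agg_with_default(df, by, overrides, default="mean"):
    """Group `df` by `by`, using `overrides` where given and `default` elsewhere.

    Hypothetical helper: it only builds the dictionary that .agg() expects.
    """
    aggdict = {
        col: overrides.get(col, default)
        for col in df.columns
        if col != by  # the grouping column itself is not aggregated
    }
    return df.groupby(by).agg(aggdict)


# Toy data standing in for the 290-column frame (assumed for illustration)
df = pd.DataFrame({
    "column1": ["a", "a", "b"],
    "column2": [1, 2, 3],
    "column3": [10, 20, 30],
    "column4": [1.0, 3.0, 5.0],
    "column5": [2.0, 4.0, 6.0],
})

result = agg_with_default(df, "column1", {"column2": "sum", "column3": "count"})
print(result)
```

Only the columns that deviate from the default need to be listed, however many columns the frame has.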

Upvotes: 1

Vons

Reputation: 3325

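# Aggregate the two special columns explicitly
df1 = df.groupby("column1").agg({"column2": "sum", "column3": "count"})

# Drop them and take the mean of everything else
# (axis arguments must be passed by keyword in recent pandas versions)
df2 = df.drop(columns=["column2", "column3"]).groupby("column1").mean()

# Join the two results side by side on the group index
df3 = pd.concat([df1, df2], axis=1)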
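As a quick sanity check, the split-and-concatenate approach can be compared against a single dictionary-based agg call on toy data. The frame below is made up for illustration, and the comparison assumes column2 and column3 come right after the grouping column, so the column order matches.

```python
import pandas as pd

# Toy data (assumed for illustration only)
df = pd.DataFrame({
    "column1": ["a", "a", "b", "b"],
    "column2": [1, 2, 3, 4],
    "column3": [5, 6, 7, 8],
    "column4": [1.0, 2.0, 3.0, 5.0],
})

# Two-step approach: special columns first, then the mean of the rest
df1 = df.groupby("column1").agg({"column2": "sum", "column3": "count"})
df2 = df.drop(columns=["column2", "column3"]).groupby("column1").mean()
two_step = pd.concat([df1, df2], axis=1)

# One-step approach: a single dictionary passed to agg
one_step = df.groupby("column1").agg(
    {"column2": "sum", "column3": "count", "column4": "mean"}
)

# Raises AssertionError if the two results differ
pd.testing.assert_frame_equal(two_step, one_step)
print(two_step)
```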

Upvotes: 0
