Adestin

Reputation: 153

Pandas groupby agg with multiple functions to apply returns error

Problem

I am attempting a groupby on a simple dataframe (downloadable csv) followed by agg to return aggregate values for one column (size, sum, mean, standard deviation). What seems like a simple problem is giving an unexpectedly challenging error.

Top15.groupby('Continent')['Pop Est'].agg(np.mean, np.std...etc)
# returns 
ValueError: No axis named <function std at 0x7f16841512f0> for object type <class 'pandas.core.series.Series'>

What I am trying to get is a df with index set to continents and columns ['size', 'sum', 'mean', 'std']

Example Code

import pandas as pd
import numpy as np

# Create df
df = pd.DataFrame({'Country':['Australia','China','America','Germany'],'Pop Est':['123','234','345','456'],'Continent':['Asia','Asia','North America','Europe']})

# group and agg
df = df.groupby('Continent')['Pop Est'].agg('size','sum','np.mean','np.std')

Upvotes: 2

Views: 4153

Answers (1)

sparrow

Reputation: 11460

size works on any dtype and sum on strings just concatenates them, but mean and std need numeric values, so when you create your dataframe don't input your numbers as strings:

df = pd.DataFrame({'Country':['Australia','China','America','Germany'],'PopEst':[123,234,345,456],'Continent':['Asia','Asia','North America','Europe']})
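If the numbers have already been loaded as strings (for example from a csv), they can be converted after the fact rather than retyped. A minimal sketch, assuming the question's original 'Pop Est' column name and using pd.to_numeric; the errors='coerce' choice is an assumption about how bad values should be handled:

```python
import pandas as pd

# Same data as in the question, with the populations stored as strings.
df = pd.DataFrame({
    'Country': ['Australia', 'China', 'America', 'Germany'],
    'Pop Est': ['123', '234', '345', '456'],
    'Continent': ['Asia', 'Asia', 'North America', 'Europe'],
})

# Convert the string column to a numeric dtype. errors='coerce' turns any
# entry that cannot be parsed into NaN instead of raising an exception.
df['Pop Est'] = pd.to_numeric(df['Pop Est'], errors='coerce')
```

After this, mean and std can be computed on the column as usual.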

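Then pass the functions to agg as a single list. Extra positional arguments are forwarded to the first function instead of being treated as further aggregations, which is where the "No axis named" error comes from. I think this will get you what you want: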

grouped = df.groupby('Continent')
grouped['PopEst'].agg(['size','sum','mean','std'])


               size  sum   mean        std
Continent                                 
Asia              2  357  178.5  78.488853
Europe            1  456  456.0        NaN
North America     1  345  345.0        NaN
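For reference, here is a self-contained sketch of the same aggregation, plus a variant using named aggregation (keyword = function, available in pandas 0.25 and later) for when you want to pick the output column names yourself. The names n, total, average and spread are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['Australia', 'China', 'America', 'Germany'],
    'PopEst': [123, 234, 345, 456],
    'Continent': ['Asia', 'Asia', 'North America', 'Europe'],
})

# All functions go in ONE list; separate positional arguments would not be
# treated as additional aggregations.
stats = df.groupby('Continent')['PopEst'].agg(['size', 'sum', 'mean', 'std'])

# Named aggregation: each keyword becomes an output column name
# (these names are hypothetical, chosen only for this example).
named = df.groupby('Continent')['PopEst'].agg(
    n='size', total='sum', average='mean', spread='std'
)
```

std is NaN for Europe and North America because each of those groups has a single row, and the sample standard deviation is undefined for one observation.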

Upvotes: 5
