Moondra

Reputation: 4511

using 'groupby.count' with agg

df.head()

               Populous     Continents
Australia  2.331602e+07      Australia
Brazil     2.059153e+08  South America
Canada     3.523986e+07  North America
China      1.367645e+09           Asia
France     6.383735e+07         Europe

Above are the first five rows of my DataFrame. I want to group them by Continents and then compute some statistics per group. Specifically, I want a new DataFrame whose columns are the average, sum, and standard deviation of each group's Populous values, plus the count of countries in each group.

new_df = df.groupby('Continents')['Populous'].agg({'Avg': np.average, 'Sum': np.sum, 'STD': np.std}) takes care of three of the columns, but I don't know how to get the count in there. I tried including 'Size': count in the dict passed to agg, but it resulted in an error.

Thank you.

Upvotes: 1

Views: 1004

Answers (2)

piRSquared

Reputation: 294298

You might also find this useful:

df.groupby('Continents').Populous.describe().unstack()

Also see this answer if you want more stats.
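As a rough sketch of what describe() gives you here: the snippet below rebuilds the five rows shown in the question and calls describe() on the grouped column. It assumes a reasonably recent pandas, where the result should already be a DataFrame with one row per group, so the trailing unstack() may not be needed on your version.

```python
import pandas as pd

# The five rows shown in the question.
df = pd.DataFrame(
    {
        "Populous": [2.331602e+07, 2.059153e+08, 3.523986e+07,
                     1.367645e+09, 6.383735e+07],
        "Continents": ["Australia", "South America", "North America",
                       "Asia", "Europe"],
    },
    index=["Australia", "Brazil", "Canada", "China", "France"],
)

# One row per continent; columns are count, mean, std, min,
# the three quartiles, and max.
stats = df.groupby("Continents")["Populous"].describe()
print(stats)
```

With only one country per continent in this sample, std comes out as NaN for every group.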

Upvotes: 2

pansen

Reputation: 6663

You can use 'Size': len or 'Size': 'count' for this to work (a bare count is presumably not a defined name in your session, hence the error). However, as @DSM pointed out, len counts missing values whereas 'count' does not.
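Here is a minimal sketch of the difference, assuming pandas 0.25 or later. It uses keyword-style named aggregation instead of the renaming dict from the question, because pandas 1.0 and later reject that dict form on a single grouped column. The row labelled "Hypothetica" is made up for illustration and is not in the original data; it only exists to put a missing value into one group.

```python
import numpy as np
import pandas as pd

# The five rows from the question, plus one hypothetical row with a
# missing value (not in the original data) so the two counts differ.
df = pd.DataFrame(
    {
        "Populous": [2.331602e+07, 2.059153e+08, 3.523986e+07,
                     1.367645e+09, 6.383735e+07, np.nan],
        "Continents": ["Australia", "South America", "North America",
                       "Asia", "Europe", "Europe"],
    },
    index=["Australia", "Brazil", "Canada", "China", "France", "Hypothetica"],
)

new_df = df.groupby("Continents")["Populous"].agg(
    Avg="mean",
    Sum="sum",
    STD="std",
    Count="count",  # non-missing values only
    Size=len,       # every row in the group, missing or not
)
print(new_df)
```

For the Europe group, Count should be 1 and Size should be 2, which is the distinction @DSM pointed out.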

Upvotes: 1
