Satish
Satish

Reputation: 3

Grouped by counts in Python

We have a df called allyears that has years, gender, names in it.

Something like this:

name sex number year
John M 1 2010
Jane F 2 2011

I want to group the top10 names for a given year with their respective counts. I tried this code, but this is not returning what I am looking for.

males = allyears[(allyears.year>2009)&(allyears.sex=='M')]
maleNameCounts = pd.DataFrame(males.groupby(['year', 'name']).count())
maleNameCounts.sort_values('number', ascending=True)

How should I be approaching this problem?

Upvotes: 0

Views: 65

Answers (2)

SAL
SAL

Reputation: 632

You can try following:

males = allyears[(allyears.year>2009)&(allyears.sex=='M'),]
maleNameCounts = df.groupby(['Year', 'Name']).size().nlargest(10).reset_index().rename(columns={0:'count'})
maleNameCounts

Upvotes: 1

Larry the Llama
Larry the Llama

Reputation: 1100

Hope this helps:

Add a column with counts

df["name_count"] = df[name].map(df.name.value_counts())

Optional to remove duplicates

df = df.drop_duplicates(["name"])

Sort (by counts)

df = df.sort_values("name_count")

Note that this can all be tweaked were necessary.

Upvotes: 1

Related Questions