Reputation: 3

Grouped by counts in Python

We have a df called allyears that has years, gender, names in it.

Something like this:

name	sex	number	year
John	M	1	2010
Jane	F	2	2011

I want to group the top10 names for a given year with their respective counts. I tried this code, but this is not returning what I am looking for.

males = allyears[(allyears.year>2009)&(allyears.sex=='M')]
maleNameCounts = pd.DataFrame(males.groupby(['year', 'name']).count())
maleNameCounts.sort_values('number', ascending=True)

How should I be approaching this problem?

Upvotes: 0

Answers (2)

SAL

Reputation: 632

You can try following:

males = allyears[(allyears.year>2009)&(allyears.sex=='M'),]
maleNameCounts = df.groupby(['Year', 'Name']).size().nlargest(10).reset_index().rename(columns={0:'count'})
maleNameCounts

Upvotes: 1

Larry the Llama

Reputation: 1100

Hope this helps:

Add a column with counts

df["name_count"] = df[name].map(df.name.value_counts())

Optional to remove duplicates

df = df.drop_duplicates(["name"])

Sort (by counts)

df = df.sort_values("name_count")

Note that this can all be tweaked were necessary.

Upvotes: 1

Grouped by counts in Python

Answers (2)

Related Questions