MrStewart

Reputation: 81

Python - Filtering rows by the top 5 most frequent values of a categorical variable

I'm trying to create a new DataFrame that contains only the rows whose country is among the 5 most frequently appearing countries.

I tried using .nlargest but it doesn't work for categorical data.

Thank you.

Example of data frame

Upvotes: 2

Views: 1175

Answers (1)

jezrael

Reputation: 862511

Use Series.value_counts: it returns a Series sorted by count in descending order, so the first five values of its index are the top 5 countries. Select them by slicing the index, then pass them to Series.isin and filter with boolean indexing:

top5 = df['Country'].value_counts().index[:5]
df1 = df[df['Country'].isin(top5)]
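
For a quick check, here is a minimal, self-contained sketch of the same approach. The sample data is hypothetical (only the 'Country' column name comes from the question), and the counts are deliberately distinct so the top 5 is unambiguous:

```python
import pandas as pd

# Hypothetical sample data: 7 countries with distinct counts (6, 5, 4, 3, 2)
# plus two countries that appear once each.
countries = ["US"] * 6 + ["UK"] * 5 + ["DE"] * 4 + ["FR"] * 3 + ["IT"] * 2 + ["ES", "JP"]
df = pd.DataFrame({"Country": countries, "Value": range(len(countries))})

# value_counts() sorts by count in descending order,
# so the first 5 index labels are the 5 most frequent countries.
top5 = df["Country"].value_counts().index[:5]

# Keep only the rows whose country is in the top 5.
df1 = df[df["Country"].isin(top5)]

# If the goal is instead to drop those rows, negate the mask with ~.
rest = df[~df["Country"].isin(top5)]

print(list(top5))
print(len(df1), len(rest))
```

If several countries tie at the cutoff, only five labels are taken, so which of the tied countries is included depends on the order value_counts returns them in.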

Upvotes: 4
