Reputation: 81
I'm trying to create a new DataFrame that keeps only the rows belonging to the 5 most frequently occurring countries.
I tried using .nlargest, but it doesn't work on categorical (non-numeric) data.
Thank you.
Upvotes: 2
Views: 1175
Reputation: 862511
Use Series.value_counts. It returns a Series sorted by counts in descending order, so the top values are the first entries of its index. Slice the index to get them, then pass the result to Series.isin for boolean indexing:
top5 = df['Country'].value_counts().index[:5]
df1 = df[df['Country'].isin(top5)]
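For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample data and the `Value` column are made up for illustration; only the `Country` column name comes from the question.

```python
import pandas as pd

# Hypothetical sample data: six countries with distinct frequencies.
countries = (["US"] * 6 + ["IN"] * 5 + ["DE"] * 4
             + ["FR"] * 3 + ["BR"] * 2 + ["JP"] * 1)
df = pd.DataFrame({"Country": countries, "Value": range(len(countries))})

# value_counts() sorts by count (descending by default), so the first
# five index labels are the five most frequent countries.
top5 = df["Country"].value_counts().index[:5]

# Boolean mask: True for rows whose country is one of those labels.
df1 = df[df["Country"].isin(top5)]

print(list(top5))
print(len(df), len(df1))
```

Note that slicing always takes exactly five labels, so if several countries are tied at the cutoff, which of them is kept depends on the order value_counts returns.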
Upvotes: 4