Reputation: 630
I have a data frame like this:
import pandas as pd
data = [['bob', 1], ['james', 4], ['joe', 4], ['joe', 1], ['bob', 3], ['wendy', 5], ['joe', 7]]
df = pd.DataFrame(data, columns=['name', 'score'])
print(df)
This prints:
    name  score
0    bob      1
1  james      4
2    joe      4
3    joe      1
4    bob      3
5  wendy      5
6    joe      7
I would like to drop, in a Pythonic way, every person who occurs only once, i.e. the result should look like:
   name  score
0   bob      1
2   joe      4
3   joe      1
4   bob      3
6   joe      7
And how would I do the same for names that have only 1 or 2 occurrences? i.e.
   name  score
2   joe      4
3   joe      1
6   joe      7
Upvotes: 1
Views: 237
Reputation: 8302
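Try this: DataFrameGroupBy.nunique gets the number of unique scores in each group, and isin filters out the groups with the unwanted counts. Note that nunique counts distinct score values rather than rows, so it equals the number of occurrences only when a name's scores are all different (as they are here).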
g = df.groupby(['name'])['score'].transform('nunique')
df[~g.isin([1])]
   name  score
0   bob      1
2   joe      4
3   joe      1
4   bob      3
6   joe      7
df[~g.isin([1,2])]
   name  score
2   joe      4
3   joe      1
6   joe      7
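If a name can appear several times with the same score, counting rows instead of distinct scores may be safer. The following is a minimal sketch of that variant, not the method used above: it swaps 'nunique' for transform('size'). The helper name drop_rare and its parameters max_count and key are made up for illustration.

```python
import pandas as pd

data = [['bob', 1], ['james', 4], ['joe', 4], ['joe', 1],
        ['bob', 3], ['wendy', 5], ['joe', 7]]
df = pd.DataFrame(data, columns=['name', 'score'])


def drop_rare(frame, max_count, key='name'):
    """Hypothetical helper: drop rows whose `key` value occurs at most `max_count` times."""
    # transform('size') broadcasts each group's row count back onto its rows
    counts = frame.groupby(key)[key].transform('size')
    return frame[counts > max_count]


more_than_once = drop_rare(df, 1)    # keeps bob and joe
more_than_twice = drop_rare(df, 2)   # keeps joe only

# A repeated score still counts as a separate occurrence here
dup = pd.DataFrame({'name': ['ann', 'ann', 'cy'], 'score': [2, 2, 9]})
dup_kept = drop_rare(dup, 1)         # keeps both 'ann' rows
```

With 'nunique', the two 'ann' rows in the last frame would count as a single occurrence and be dropped. Row counts keep them.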
Upvotes: 2