Reputation: 57
I need to remove all tokens where the number of rows in the group is less than 3.
Dataframe:-
| | token | active |
---|---|---|
0 | 58 | 1 |
1 | 58 | 8 |
2 | 63 | 5 |
3 | 63 | 9 |
4 | 63 | 0 |
5 | 97 | 6 |
6 | 97 | 1 |
I found the groups that have fewer than 3 rows, but how do I remove those groups from the main dataframe?
c_df = df.groupby('token').agg('count')
sc_df = c_df.loc[c_df['active'] < 3]
print(sc_df)
Current result:-
token | active |
---|---|
58 | 2 |
97 | 2 |
Upvotes: 3
Views: 1094
Reputation: 323226
Let us try groupby().filter, which keeps only the groups for which the function returns True
out = df.groupby('token').filter(lambda x : len(x)>=3)
Out[24]:
token active
2 63 5
3 63 9
4 63 0
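For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question, and the lambda argument name `g` is only an illustrative choice:

```python
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "token": [58, 58, 63, 63, 63, 97, 97],
    "active": [1, 8, 5, 9, 0, 6, 1],
})

# filter() calls the function once per group and keeps every row of the
# groups for which it returns True; the original index is preserved.
out = df.groupby("token").filter(lambda g: len(g) >= 3)
print(out)
```

Since the function runs in Python once per group, this may be slower than a transform-based mask when there are many distinct tokens.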
Upvotes: 2
Reputation: 260455
You can use boolean indexing with a per-group count:
df = df[df.groupby('token')['active'].transform('count').ge(3)]
output:
token active
2 63 5
3 63 9
4 63 0
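One caveat: 'count' counts non-null values in the selected column, so a group with NaN in active would look smaller than it is. If you want the number of rows, 'size' is the safer choice. The sketch below shows that variant, plus a way to finish the approach from the question with isin. The data is rebuilt from the question's table, and the names `group_size`, `small_tokens`, `kept` and `kept_isin` are illustrative, not from the original posts:

```python
import pandas as pd

# Sample data reconstructed from the question's table.
df = pd.DataFrame({
    "token": [58, 58, 63, 63, 63, 97, 97],
    "active": [1, 8, 5, 9, 0, 6, 1],
})

# Variant 1: broadcast each group's row count back onto its rows,
# then use the result as a boolean mask.
group_size = df.groupby("token")["token"].transform("size")
kept = df[group_size >= 3]

# Variant 2: continue from the question's idea. Find the tokens of the
# small groups, then drop every row whose token is among them.
counts = df.groupby("token").size()
small_tokens = counts[counts < 3].index
kept_isin = df[~df["token"].isin(small_tokens)]

print(kept)
print(kept_isin)
```

Both variants should select the same rows on this data; the mask version avoids building the intermediate list of small tokens.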
Upvotes: 3