yaliga

Reputation: 57

Removing groups by number of rows per group in a pandas DataFrame

I need to remove all rows belonging to tokens whose group has fewer than 3 rows.

DataFrame:

   token  active
0     58       1
1     58       8
2     63       5
3     63       9
4     63       0
5     97       6
6     97       1

I can find the groups that have fewer than 3 rows, but how do I remove those groups from the main DataFrame?

c_df = df.groupby('token').agg('count')
sc_df = c_df.loc[c_df['active'] < 3]
print(sc_df)

Current result:

       active
token
58          2
97          2

Upvotes: 3

Views: 1094

Answers (2)

BENY

Reputation: 323226

You can use GroupBy.filter, which keeps only the groups for which the function returns True:

out = df.groupby('token').filter(lambda x: len(x) >= 3)
print(out)
   token  active
2     63       5
3     63       9
4     63       0
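
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the frame is built from the sample rows in the question with a default integer index; the variable name `g` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "token": [58, 58, 63, 63, 63, 97, 97],
    "active": [1, 8, 5, 9, 0, 6, 1],
})

# The lambda receives each group's sub-DataFrame; groups for which it
# returns True are kept, and their rows keep their original index labels.
out = df.groupby("token").filter(lambda g: len(g) >= 3)
print(out)
```

Because the lambda is called once per group in Python, this can be slower than a vectorized approach on frames with very many groups.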

Upvotes: 2

mozway

Reputation: 260455

You can use boolean indexing, with transform broadcasting each group's count back to its rows:

df = df[df.groupby('token')['active'].transform('count').ge(3)]

Output:

   token  active
2     63       5
3     63       9
4     63       0
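
One caveat worth knowing: 'count' counts only the non-null values in the selected column, whereas 'size' counts rows. The sketch below illustrates the difference. The NaN is a hypothetical change to the question's data, introduced only to show the effect; with the original data both variants give the same result.

```python
import numpy as np
import pandas as pd

# Question's data, with one value replaced by NaN (hypothetical, for illustration).
df = pd.DataFrame({
    "token": [58, 58, 63, 63, 63, 97, 97],
    "active": [1, 8, 5, np.nan, 0, 6, 1],
})

# 'count' ignores the NaN, so token 63 is seen as having only 2 values.
by_count = df[df.groupby("token")["active"].transform("count").ge(3)]

# 'size' counts rows regardless of missing values, so token 63 has 3.
by_size = df[df.groupby("token")["active"].transform("size").ge(3)]

print(by_count)
print(by_size)
```

If the goal is "groups with at least 3 rows" rather than "at least 3 non-null values", 'size' is probably the safer choice.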

Upvotes: 3
