thepez87

Reputation: 221

How to group by one column in pandas and filter the DataFrame on a minimum number of unique values in another column?

I have a data frame that looks like this:

CP   AID   type
1    1      b
1    2      b
1    3      a
2    4      a
2    4      b
3    5      b
3    6      a
3    7      b

I would like to group by the CP column and keep only the rows whose CP has at least 3 unique values in the AID column.

The result should look like this:

CP   AID   type
1    1      b
1    2      b
1    3      a
3    5      b
3    6      a
3    7      b
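
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame above and counts the distinct AID values per CP. The dtypes (integers for CP and AID, strings for type) and the name unique_per_cp are assumptions; the question does not state them.

```python
import pandas as pd

# Sample data copied from the question; dtypes are assumed.
df = pd.DataFrame({
    "CP":   [1, 1, 1, 2, 2, 3, 3, 3],
    "AID":  [1, 2, 3, 4, 4, 5, 6, 7],
    "type": ["b", "b", "a", "a", "b", "b", "a", "b"],
})

# Number of distinct AID values in each CP group.
unique_per_cp = df.groupby("CP")["AID"].nunique()
print(unique_per_cp)
```

CP 2 has only one distinct AID (4 appears twice), which is why its rows should be dropped.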

Upvotes: 1

Views: 212

Answers (2)

Erfan

Reputation: 42916

You can combine groupby with unique and check the length of each group's array of unique values:

m = df.groupby('CP').AID.transform('unique').str.len() >= 3

print(df[m])
   CP  AID type
0   1    1    b
1   1    2    b
2   1    3    a
5   3    5    b
6   3    6    a
7   3    7    b

Or, as RafaelC mentioned in the comments, use nunique to count the unique values directly:

m = df.groupby('CP').AID.transform('nunique').ge(3)

print(df[m])
   CP  AID type
0   1    1    b
1   1    2    b
2   1    3    a
5   3    5    b
6   3    6    a
7   3    7    b
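
As a self-contained sketch of the nunique approach, the snippet below rebuilds the question's data (dtypes assumed) and applies the mask. It also shows a groupby(...).filter(...) variant, which is not part of the answer above and is included only for comparison; the names by_mask and by_filter are illustrative.

```python
import pandas as pd

# Sample data from the question; dtypes are assumed.
df = pd.DataFrame({
    "CP":   [1, 1, 1, 2, 2, 3, 3, 3],
    "AID":  [1, 2, 3, 4, 4, 5, 6, 7],
    "type": ["b", "b", "a", "a", "b", "b", "a", "b"],
})

# Broadcast each group's distinct-AID count back onto its rows,
# then keep the rows whose group has at least 3 distinct AIDs.
mask = df.groupby("CP")["AID"].transform("nunique").ge(3)
by_mask = df[mask]

# Alternative (not from the answer): let filter() drop whole groups.
by_filter = df.groupby("CP").filter(lambda g: g["AID"].nunique() >= 3)

print(by_mask)
```

Both variants keep the original row index. The transform version avoids calling a Python lambda once per group, which may matter when there are many groups.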

Upvotes: 3

RenauV

Reputation: 383

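You can count the unique AID values per CP and keep only the CPs that have at least 3:

# Number of unique AID values for each CP (nunique, not count, so duplicates are ignored)
count = df1.groupby('CP')['AID'].nunique().reset_index()
# Keep the rows whose CP has at least 3 unique AID values
df1 = df1[df1['CP'].isin(count.loc[count['AID'] >= 3, 'CP'])]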
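
Note that counting rows and counting distinct values per group are not the same thing. The sketch below uses made-up data (not the question's frame) in which CP 2 has three rows but only one distinct AID, so a row count would keep it while a distinct count would not. All names here are illustrative.

```python
import pandas as pd

# Hypothetical data: CP 2 has 3 rows but a single distinct AID.
df1 = pd.DataFrame({
    "CP":  [1, 1, 1, 2, 2, 2],
    "AID": [1, 2, 3, 4, 4, 4],
})

rows_per_cp = df1.groupby("CP")["AID"].count()        # non-null rows per group
distinct_per_cp = df1.groupby("CP")["AID"].nunique()  # distinct values per group

# Keep only the CPs with at least 3 distinct AID values.
keep = distinct_per_cp[distinct_per_cp >= 3].index
kept = df1[df1["CP"].isin(keep)]
print(kept)
```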

Upvotes: 0
