Find all groups in which specific values show up

Question

I'm new to Python and Pandas. I have the following DataFrame:

import pandas as pd

df = pd.DataFrame( {'a':['A','A','B','B','B','C','C','C'], 'b':[1,3,1,2,3,1,3,3]})

    a   b
0   A   1
1   A   3
2   B   1
3   B   2
4   B   3
5   C   1
6   C   3
7   C   3

I would like to create a new DataFrame in which only groups from column A that have the values 1 and 2 in column b show up, that is:

I know we can create groups using df.groupby('a'), and the method df.all() seems to be related to this, but I can't figure it out by myself. It seems like it should be straightforward. Any help?

ansev · Accepted Answer

Use GroupBy.filter + Series.any:

new_df=df.groupby('a').filter(lambda x: x.b.eq(2).any() & x.b.eq(1).any())
print(new_df)

   a  b
2  B  1
3  B  2
4  B  3

We could also use:

new_df=df[df.groupby('a').transform(lambda x: x.eq(1).any() & x.eq(2).any()).b]
print(new_df)

   a  b
2  B  1
3  B  2
4  B  3

Find all groups in which specific values show up

Answers (2)

Related Questions