cosmia1
cosmia1

Reputation: 129

Find all groups in which specific values show up

I'm new to Python and Pandas. I have the following DataFrame:

import pandas as pd

df = pd.DataFrame( {'a':['A','A','B','B','B','C','C','C'], 'b':[1,3,1,2,3,1,3,3]})

    a   b
0   A   1
1   A   3
2   B   1
3   B   2
4   B   3
5   C   1
6   C   3
7   C   3

I would like to create a new DataFrame in which only groups from column A that have the values 1 and 2 in column b show up, that is:

    a   b
0   B   1
1   B   2
2   B   3

I know we can create groups using df.groupby('a'), and the method df.all() seems to be related to this, but I can't figure it out by myself. It seems like it should be straightforward. Any help?

Upvotes: 0

Views: 116

Answers (2)

ansev
ansev

Reputation: 30930

Use GroupBy.filter + Series.any:

new_df=df.groupby('a').filter(lambda x: x.b.eq(2).any() & x.b.eq(1).any())
print(new_df)

   a  b
2  B  1
3  B  2
4  B  3

We could also use:

new_df=df[df.groupby('a').transform(lambda x: x.eq(1).any() & x.eq(2).any()).b]
print(new_df)

   a  b
2  B  1
3  B  2
4  B  3

Upvotes: 2

Quang Hoang
Quang Hoang

Reputation: 150785

Another approach:

s = (pd.DataFrame(df['b'].values == np.array([[1],[2]])).T
       .groupby(df['a'])
       .transform('any')
       .all(1)
    )

df[s]

Output:

    a   b
2   B   1
3   B   2
4   B   3

Upvotes: 1

Related Questions