Reputation: 793
import pandas as pd

df = pd.DataFrame({'A': ['bar', 'bar', 'bar', 'foo', 'foo', 'foo'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5.0, 8.0, 1.0, 2.0, 9.0]})
>>> df
A B C
0 bar 1 2.0
1 bar 2 5.0
2 bar 3 8.0
3 foo 4 1.0
4 foo 5 2.0
5 foo 6 9.0
How can I get the groups that contain both values of neededVals = [1.0,2.0] in column C if I groupby('A'):
3 foo 4 1.0
4 foo 5 2.0
5 foo 6 9.0
And how can I get just the rows with those values as well:
3 foo 4 1.0
4 foo 5 2.0
Upvotes: 1
Views: 34
Reputation: 863226
I think you need to compare sets with GroupBy.transform and filter by boolean indexing:
neededVals = [1.0,2.0]
df = df[df.groupby('A')['C'].transform(lambda x: set(x) >= set(neededVals))]
print (df)
A B C
3 foo 4 1.0
4 foo 5 2.0
5 foo 6 9.0
Detail:
print (df.groupby('A')['C'].transform(lambda x: set(x) >= set(neededVals)))
0 False
1 False
2 False
3 True
4 True
5 True
Name: C, dtype: bool
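To make it clearer what the mask above represents, here is a minimal sketch that computes the check once per group and then maps it back onto the rows, which is roughly what transform broadcasts for you. The names needed, per_group, mask and result are only for illustration and are not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': ['bar', 'bar', 'bar', 'foo', 'foo', 'foo'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5.0, 8.0, 1.0, 2.0, 9.0]})
needed = {1.0, 2.0}  # illustrative name for set(neededVals)

# One boolean per group: does the group's C column contain every needed value?
per_group = df.groupby('A')['C'].apply(lambda x: needed <= set(x))

# Map the per-group answer back onto each row via its group key
mask = df['A'].map(per_group).astype(bool)

# Boolean indexing keeps all rows of the groups that passed
result = df[mask]
```

Here per_group has one entry per value of A, and mask has one entry per row of df, so it can be used directly for boolean indexing.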
For the second output, first filter out the unnecessary rows with isin and then compare the sets for equality:
df = df[df['C'].isin(neededVals)]
df = df[df.groupby('A')['C'].transform(lambda x: set(x) == set(neededVals))]
print (df)
A B C
3 foo 4 1.0
4 foo 5 2.0
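If you want the second result without overwriting df, the two steps can be wrapped in a small function. This is only a sketch: rows_with_needed and its parameters are hypothetical names, and it assumes the same per-group check followed by mapping the result back onto the rows.

```python
import pandas as pd

def rows_with_needed(df, key, col, needed):
    """Hypothetical helper: rows whose `col` is in `needed`,
    restricted to groups that contain every value of `needed`."""
    needed = set(needed)
    # Step 1: drop rows whose value is not one of the needed values
    subset = df[df[col].isin(needed)]
    # Step 2: per group, check that the remaining values cover all of `needed`
    has_all = subset.groupby(key)[col].apply(lambda x: set(x) == needed)
    # Map the per-group result back to rows and filter
    return subset[subset[key].map(has_all).astype(bool)]

df = pd.DataFrame({'A': ['bar', 'bar', 'bar', 'foo', 'foo', 'foo'],
                   'B': [1, 2, 3, 4, 5, 6],
                   'C': [2.0, 5.0, 8.0, 1.0, 2.0, 9.0]})

out = rows_with_needed(df, 'A', 'C', [1.0, 2.0])
```

Note that if a group contains a needed value more than once, all of those rows are kept, because the comparison is done on sets.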
Upvotes: 1