Reputation: 25997
I have a dataframe created like this:
import pandas as pd
df = pd.DataFrame({
'ind1': list('AAABBBCCC'),
'ind2': list(map(int, list('123123123'))),
'val1': [0, 0, 0, -1, -4, 5, 10, 11, 4],
'val2': [0.1, 0.2, -0.2, 0.1, 0.2, 0.2, -0.1, 2, 0.1]
})
df = df.set_index(['ind1', 'ind2'])
Resulting data:
val1 val2
ind1 ind2
A 1 0 0.1
2 0 0.2
3 0 -0.2
B 1 -1 0.1
2 -4 0.2
3 5 0.2
C 1 10 -0.1
2 11 2.0
3 4 0.1
I want to select all entries where:
val1
is unequal 0
val2
is < 0.5
In the example above, therefore only
B 1 -1 0.1
2 -4 0.2
3 5 0.2
should remain.
I cannot use sum()
as the values can be positive and negative, so something like this
df.reset_index().groupby('ind1').sum()
ind2 val1 val2
ind1
A 6 0 0.1
B 6 0 0.5
C 6 25 2.0
would not work.
How would I use any()
and all()
here?
Upvotes: 1
Views: 55
Reputation: 150735
This works for me, similar to Wen's solution:
mask = df.abs().groupby(level=0).transform('max')
df[mask.val1.gt(0)&mask.val2.lt(0.5)]
Output:
val1 val2
ind1 ind2
B 1 -1 0.1
2 -4 0.2
3 5 0.2
Upvotes: 1
Reputation: 323226
Without lambda by transform
s1=df.val1.ne(0).groupby(level=0).transform('any')
s2=df.val2.abs().lt(0.5).groupby(level=0).transform('all')
df[s1&s2]
Out[583]:
val1 val2
ind1 ind2
B 1 -1 0.1
2 -4 0.2
3 5 0.2
Upvotes: 2
Reputation: 11105
One way is via groupby().filter()
(link to docs), which evaluates a boolean condition on each group (as opposed to each DataFrame row):
df.groupby('ind1').filter(lambda x: x['val1'].any() &
(x['val2'].abs() < 0.5).all())
val1 val2
ind1 ind2
B 1 -1 0.1
2 -4 0.2
3 5 0.2
Note that DataFrame.groupby.filter()
is unrelated to the same-named method DataFrame.filter()
!
Upvotes: 1