Reputation: 384
I have a pandas dataframe containing rows with numbered columns:
1 2 3 4 5
a 0 0 0 0 1
b 1 1 2 1 9
c 2 2 2 2 2
d 5 5 5 5 5
e 8 9 9 9 9
How can I filter out the rows where a subset of columns are all above or below a certain value?
So, for example: I want to remove every row in which the values in columns 1 to 3 are not all > 3. In the above, that would leave me with only rows d and e.
The columns I am filtering and the value I am checking against are both arguments.
I've tried a few things; this is the closest I've gotten:
df[df[range(1,3)]>3]
Any ideas?
Upvotes: 5
Views: 2925
Reputation: 394469
You can achieve this without using apply:
In [73]:
df[(df.iloc[:, 0:3] > 3).all(axis=1)]
Out[73]:
1 2 3 4 5
d 5 5 5 5 5
e 8 9 9 9 9
This slices the df to just the first 3 columns using iloc, compares the result against the scalar 3, and then calls all(axis=1) to create a boolean Series that masks the rows.
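To see what each step produces, here is a minimal sketch that spells the one-liner out. The way df is built is an assumption (the question only shows the printed frame, so the column labels are taken to be the integers 1 to 5), and the intermediate names first_three, above, mask and result are only there for illustration. It uses iloc for the positional slice.

```python
import pandas as pd

# Rebuild the example frame from the question (assumed: integer column labels 1..5).
df = pd.DataFrame(
    [[0, 0, 0, 0, 1],
     [1, 1, 2, 1, 9],
     [2, 2, 2, 2, 2],
     [5, 5, 5, 5, 5],
     [8, 9, 9, 9, 9]],
    index=list("abcde"),
    columns=[1, 2, 3, 4, 5],
)

first_three = df.iloc[:, 0:3]   # positional slice: the first three columns
above = first_three > 3         # element-wise comparison, gives a boolean DataFrame
mask = above.all(axis=1)        # True only for rows where every selected column passed
result = df[mask]               # a boolean Series selects rows

print(mask.tolist())            # [False, False, False, True, True]
print(result.index.tolist())    # ['d', 'e']
```

Because the mask is an ordinary boolean Series indexed like df, it can be stored and reused, or inverted with ~mask to get the rows that were dropped.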
Upvotes: 1
Reputation: 294576
I used loc and all in this function:
def filt(df, cols, thresh):
    return df.loc[(df[cols] > thresh).all(axis=1)]
filt(df, [1, 2, 3], 3)
1 2 3 4 5
d 5 5 5 5 5
e 8 9 9 9 9
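Since the question asks about values "above or below" a threshold, the same idea could be extended with a direction switch. This is only a sketch: the function name filt_either and the above parameter are hypothetical, and the frame is rebuilt under the assumption that the column labels are the integers 1 to 5.

```python
import pandas as pd

def filt_either(df, cols, thresh, above=True):
    """Keep rows where every value in `cols` is strictly above (or below) `thresh`."""
    sub = df[cols]
    # DataFrame.gt / DataFrame.lt are the method forms of > and <
    passed = sub.gt(thresh) if above else sub.lt(thresh)
    return df.loc[passed.all(axis=1)]

# Example frame from the question (assumed: integer column labels 1..5).
df = pd.DataFrame(
    [[0, 0, 0, 0, 1],
     [1, 1, 2, 1, 9],
     [2, 2, 2, 2, 2],
     [5, 5, 5, 5, 5],
     [8, 9, 9, 9, 9]],
    index=list("abcde"),
    columns=[1, 2, 3, 4, 5],
)

print(filt_either(df, [1, 2, 3], 3).index.tolist())               # ['d', 'e']
print(filt_either(df, [1, 2, 3], 3, above=False).index.tolist())  # ['a', 'b', 'c']
```

Passing the columns as a list of labels keeps the function independent of column order, which matters if the subset is not the first few columns.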
Upvotes: 5