Reputation: 2839
I have a DataFrame df with columns named "1" through "x" and y rows. How do I drop every row in which one or more column values fail a condition, such as being greater than or less than a threshold?
I have tried this for 2 columns named "1" and "2":
df = df[df[["1", "2"]] < 0.02]
but this gives me the same number of rows, with NaNs in the columns where there used to be values.
Upvotes: 2
Views: 2303
Reputation: 10060
How about reducing the boolean mask to one value per row with all or any along axis 1?
>>> import pandas
>>> import numpy
>>> randn = numpy.random.randn
>>> df = pandas.DataFrame(randn(4, 4), columns=['A', 'B', 'C', 'D'], index=['a', 'b', 'c', 'd'])
>>> df
A B C D
a -1.509065 -1.700310 -1.443745 0.659686
b 1.303247 0.466667 -0.320595 0.428322
c -0.126422 0.203114 -1.157571 -0.766103
d -0.611362 -0.653566 0.451102 0.617120
>>> # Drop rows in which every value is below 0.5 (row c here)
>>> df[~(df < 0.5).all(axis=1)]
A B C D
a -1.509065 -1.700310 -1.443745 0.659686
b 1.303247 0.466667 -0.320595 0.428322
d -0.611362 -0.653566 0.451102 0.617120
>>> # Drop rows in which any value is above 1.3 (row b here)
>>> df[~(df > 1.3).any(axis=1)]
A B C D
a -1.509065 -1.700310 -1.443745 0.659686
c -0.126422 0.203114 -1.157571 -0.766103
d -0.611362 -0.653566 0.451102 0.617120
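Applied to the question's case, the same idea might look like the sketch below. The data values and the extra column "3" are made up for illustration; only the column names "1" and "2" and the 0.02 threshold come from the question. It also shows why the original attempt kept every row: indexing with a boolean DataFrame masks cell by cell instead of selecting rows.

```python
import pandas as pd

# Hypothetical data; only the column names "1" and "2" come from the question.
df = pd.DataFrame({
    "1": [0.01, 0.05, 0.00, 0.015],
    "2": [0.01, 0.01, 0.03, 0.019],
    "3": [9.0, 9.0, 9.0, 9.0],
})

cols = ["1", "2"]

# Indexing with a boolean DataFrame masks individual cells:
# the row count is unchanged and cells where the mask is not True become NaN.
masked = df[df[cols] < 0.02]

# Reducing the mask to one boolean per row turns it into a row filter.
# all(axis=1) keeps a row only if every checked column passes the condition.
keep = (df[cols] < 0.02).all(axis=1)
filtered = df[keep]

print(filtered)
```

With this sample data, only the rows where both "1" and "2" are below 0.02 remain, and the other columns are carried along unchanged. Swapping all for any would instead keep rows where at least one of the checked columns passes.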
Hope it helps.
EDIT: even better solution based on azuric's comments
Upvotes: 2