Reputation: 2332
I want to filter a dataframe according to whether any of several columns in a list match a test.
For example, this works:
import numpy as np
import pandas as pd

ddf = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
ddf[(ddf['A']==0)|(ddf['B']==0)|(ddf['C']==0)|(ddf['D']==0)]
...and one could build a loop if there are many more columns to process. But I wonder whether there's a more pythonic way to proceed, starting from the result of
ddf[list('ABCD')]==0
which gives four columns of booleans, over which I'd like to apply an or operation row by row.
Upvotes: 1
Views: 94
Reputation: 46898
If the same test applies to every column (here, whether the value is zero), you can compare all the columns at once and use any() with axis=1 to check each row:
np.random.seed(999)
ddf = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
ddf[(ddf[['A','B','C','D']]==0).any(axis=1)]
     A   B   C   D
71  52  13   0  50
93  51   0  60  71
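If the test is not the same for every column, one possible extension is to build one boolean mask per column and OR them together. This is only a sketch: the per-column tests in `tests`, the seed, and the names `mask_any`, `mask_chain`, and `mask_mixed` are made up for illustration and are not part of the question.

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

np.random.seed(0)  # arbitrary seed, only for reproducibility
ddf = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))
cols = list("ABCD")

# Same test on every column: compare once, then OR across each row.
mask_any = (ddf[cols] == 0).any(axis=1)

# The chained form from the question, kept for comparison.
mask_chain = (ddf["A"] == 0) | (ddf["B"] == 0) | (ddf["C"] == 0) | (ddf["D"] == 0)

# Hypothetical per-column tests: each maps a column (Series) to a boolean Series.
tests = {
    "A": lambda s: s == 0,
    "B": lambda s: s > 95,
    "C": lambda s: s.isin([1, 2]),
}

# OR the individual masks together; the result is a boolean Series
# aligned with ddf's index.
mask_mixed = reduce(operator.or_, (test(ddf[c]) for c, test in tests.items()))

filtered = ddf[mask_mixed]
```

Here `mask_any` and `mask_chain` select the same rows, so the any(axis=1) form can replace the chained expression however many columns are in `cols`.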
Upvotes: 2