Reputation: 515
I have a dataframe with many columns (around 1000). Given a set of columns (around 10), which have 0 or 1 as values, I would like to select all the rows where I have 1s in the aforementioned set of columns.
Toy example. My dataframe is something like this:
c1,c2,c3,c4,c5
'a',1,1,0,1
'b',0,1,0,0
'c',0,0,1,1
'd',0,1,0,0
'e',1,0,0,1
And I would like to get the rows where the columns c2 and c5 are equal to 1:
'a',1,1,0,1
'e',1,0,0,1
What would be the most efficient way to do this?
Thanks!
Upvotes: 0
Views: 55
Reputation: 16660
import pandas as pd
frame = pd.DataFrame([
['a',1,1,0,1],
['b',0,1,0,0],
['c',0,0,1,1],
['d',0,1,0,0],
['e',1,0,0,1]], columns='c1,c2,c3,c4,c5'.split(','))
print(frame.loc[(frame['c2'] == 1) & (frame['c5'] == 1)])
Upvotes: 0
Reputation: 76917
This is more general and works for any list of columns cols:
In [1277]: cols = ['c2', 'c5']
In [1278]: df[(df[cols] == 1).all(1)]
Out[1278]:
c1 c2 c3 c4 c5
0 'a' 1 1 0 1
4 'e' 1 0 0 1
Or, with NumPy (imported as np),
In [1284]: df[np.logical_and.reduce([df[x]==1 for x in cols])]
Out[1284]:
c1 c2 c3 c4 c5
0 'a' 1 1 0 1
4 'e' 1 0 0 1
Or,
In [1279]: df.query(' and '.join(['%s==1'%x for x in cols]))
Out[1279]:
c1 c2 c3 c4 c5
0 'a' 1 1 0 1
4 'e' 1 0 0 1
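For reference, the first variant can be written as a self-contained script. This is only a minimal sketch: the helper name rows_where_all_ones is hypothetical, and the frame is the toy example from the question.

```python
import pandas as pd

# Toy frame from the question.
df = pd.DataFrame(
    [["a", 1, 1, 0, 1],
     ["b", 0, 1, 0, 0],
     ["c", 0, 0, 1, 1],
     ["d", 0, 1, 0, 0],
     ["e", 1, 0, 0, 1]],
    columns=["c1", "c2", "c3", "c4", "c5"],
)

def rows_where_all_ones(frame, cols):
    """Return the rows of `frame` in which every column in `cols` equals 1."""
    # Compare all selected columns at once, then require True across each row.
    mask = (frame[cols] == 1).all(axis=1)
    return frame[mask]

result = rows_where_all_ones(df, ["c2", "c5"])
print(result)
```

Building one boolean mask over the whole column subset keeps the call the same whether cols has 2 entries or 10.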
Upvotes: 1
Reputation: 3536
Try something like this (each comparison needs its own parentheses):
df.loc[(df['c2'] == 1) & (df['c5'] == 1)]
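A note on operator precedence: in Python, & binds more tightly than ==, so an expression such as df['c2'] == 1 & df['c5'] == 1 is parsed as a chained comparison around 1 & df['c5'] and fails. The sketch below illustrates this on the c2 and c5 columns of the toy data; the variable names raised and selected are just for illustration.

```python
import pandas as pd

# Only the two columns of interest from the toy example.
df = pd.DataFrame({"c2": [1, 0, 0, 0, 1], "c5": [1, 0, 1, 0, 1]})

# Without parentheses, `1 & df["c5"]` is evaluated first, and the resulting
# chained comparison needs the truth value of a Series, which pandas refuses.
try:
    df.loc[df["c2"] == 1 & df["c5"] == 1]
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison yields a boolean Series and `&`
# combines them element-wise.
selected = df.loc[(df["c2"] == 1) & (df["c5"] == 1)]
print(raised, list(selected.index))
```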
Upvotes: 0