Reputation: 2981
I have a dataframe that looks like this:
u1 u2 u3 u4 u5 u6
level0 level1
foo1 x1 0 1 0 0 0 0
x2 0 1 1 0 1 1
foo2 x3 0 1 0 1 0 1
x4 1 0 0 0 1 1
foo3 x5 1 0 1 0 0 0
x6 0 1 1 1 0 0
foo4 x7 1 0 0 1 0 1
x8 0 1 1 1 0 0
I want to subselect only those indices for which u3==1
. So, as output, I should get something like:
u1 u2 u3 u4 u5 u6
level0 level1
foo1
x2 0 1 1 0 1 1
foo2
foo3 x5 1 0 1 0 0 0
x6 0 1 1 1 0 0
foo4
x8 0 1 1 1 0 0
I have tried doing:
idx = pd.IndexSlice
df.loc[idx[:,:],'u2']==1
which gives:
level0 level1
foo1 x1 True
x2 True
foo2 x3 True
x4 False
foo3 x5 False
x6 True
foo4 x7 False
x8 True
but I don't know how to use this to index the original dataframe.
Any help appreciated.
Upvotes: 1
Views: 338
Reputation: 210842
you can use query() method or regular boolean indexing:
In [11]: df.query('u2 == 1')
Out[11]:
u1 u2 u3 u4 u5 u6
level0 level1
foo1 x1 0 1 0 0 0 0
x2 0 1 1 0 1 1
foo2 x3 0 1 0 1 0 1
foo3 x6 0 1 1 1 0 0
foo4 x8 0 1 1 1 0 0
In [12]: df.loc[df['u2'] == 1]
Out[12]:
u1 u2 u3 u4 u5 u6
level0 level1
foo1 x1 0 1 0 0 0 0
x2 0 1 1 0 1 1
foo2 x3 0 1 0 1 0 1
foo3 x6 0 1 1 1 0 0
foo4 x8 0 1 1 1 0 0
using .query()
method also allows you to search by index levels:
In [17]: df.query("level0 in ['foo2','foo3'] and u2 == 1")
Out[17]:
u1 u2 u3 u4 u5 u6
level0 level1
foo2 x3 0 1 0 1 0 1
foo3 x6 0 1 1 1 0 0
UPDATE:
how can I select all the u? for which the x1==1 and x3==1?
if you mean u1
and u3
then there are quite a few ways to achieve that:
In [8]: df.query("u1 == 1 and u3 == 1")
Out[8]:
u1 u2 u3 u4 u5 u6
level0 level1
foo3 x5 1 0 1 0 0 0
In [9]: df.loc[(df['u1'] == 1) & (df['u3'] == 1)]
Out[9]:
u1 u2 u3 u4 u5 u6
level0 level1
foo3 x5 1 0 1 0 0 0
In [10]: df.loc[df[['u1','u3']].eq(1).all(1)]
Out[10]:
u1 u2 u3 u4 u5 u6
level0 level1
foo3 x5 1 0 1 0 0 0
Upvotes: 1