Select rows based on condition in a set of columns

Question

I have a df

df = pd.DataFrame({'s1':[1,1],'s2':['-',1],'s3':[1,'-'], 's4':[0,1], 's5':[0,1], 's6':['-',1], 's7':[0,'-'], 's8':[0,0]})  

    s1  s2  s3  s4  s5  s6  s7  s8  
0   1   -   1   0   0   -   0   0  
1   1   1   -   1   1   1   -   0

I would like to select rows based on conditions in multiple sets of columns, such that
'1' is observed at least once in s1, s2, s3 or s4 and '1' is not observed in s5, s6, s7 or s8
OR
'1' is observed at least once in s5, s6 or s7 and '1' is not observed in s1, s2, s3, s4, s7 or s8

The result should be only the first row

Vaishali · Accepted Answer

Define two sets of conditions and use boolean indexing,

cond1 = df[['s1', 's2', 's3','s4']].eq(1).any(1)
cond2 = df[['s5', 's6', 's7','s8']].ne(1).all(1)

df.loc[(cond1 & cond2)]


    s1  s2  s3  s4  s5  s6  s7  s8
0   1   -   1   0   0   -   0   0

Select rows based on condition in a set of columns

Answers (2)

Related Questions