Reputation: 133
I have a df
df = pd.DataFrame({'s1':[1,1],'s2':['-',1],'s3':[1,'-'], 's4':[0,1], 's5':[0,1], 's6':['-',1], 's7':[0,'-'], 's8':[0,0]})
s1 s2 s3 s4 s5 s6 s7 s8
0 1 - 1 0 0 - 0 0
1 1 1 - 1 1 1 - 0
I would like to select rows based on conditions in multiple sets of columns, such that
'1' is observed at least once in s1, s2, s3 or s4 and '1' is not observed in s5, s6, s7 or s8
OR
'1' is observed at least once in s5, s6 or s7 and '1' is not observed in s1, s2, s3, s4, s7 or s8
The result should be only the first row
Upvotes: 1
Views: 66
Reputation: 38415
Define two sets of conditions and use boolean indexing,
cond1 = df[['s1', 's2', 's3','s4']].eq(1).any(1)
cond2 = df[['s5', 's6', 's7','s8']].ne(1).all(1)
df.loc[(cond1 & cond2)]
s1 s2 s3 s4 s5 s6 s7 s8
0 1 - 1 0 0 - 0 0
Upvotes: 2
Reputation: 999
Here goes:
condition_1 = ((df.s1 == 1) | (df.s2 == 1) | (df.s3 == 1) | (df.s4 == 1)) & (df.s5 != 1) & (df.s6 != 1) & (df.s7 != 1) & (df.s8 != 1)
condition_2 = ((df.s5 == 1) | (df.s6 == 1) | (df.s7 == 1)) & (df.s1 != 1) & (df.s2 != 1) & (df.s3 != 1) & (df.s4 != 1) & (df.s8 != 1)
df[condition_1 | condition_2].head(1)
output:
s1 s2 s3 s4 s5 s6 s7 s8
0 1 - 1 0 0 - 0 0
Upvotes: 0