Reputation: 1017
I have the following dataframe that I read from csv file:
gene name annotation ng DNA
0 HRAS G12S 3.00
1 PIK3CA R88L 3.00
2 BRAF E474A 3.00
3 EGFR E734Q 3.00
4 EGFR V769 3.00
5 BRAF LQ599PE 4.00
6 BRAF KT587NA 4.00
7 HRAS G12S 17.70
I'd like to filter according to multiple conditions in 2 columns: for example to filter according to 'BRAF' + 'E474A' and 'HRAS'+ 'G12S', so the following df will be created:
gene name annotation ng DNA
0 HRAS G12S 3.00
2 BRAF E474A 3.00
7 HRAS G12S 17.70
Any ideas for an elegant solution?
Upvotes: 1
Views: 29
Reputation: 862611
Use boolean indexing
and for join all masks to one by np.logical_or.reduce
:
m1 = (df['gene name'] == 'BRAF') & (df['annotation'] == 'E474A')
m2 = (df['gene name'] == 'HRAS') & (df['annotation'] == 'G12S')
df = df[np.logical_or.reduce([m1, m2])]
print (df)
gene name annotation ng DNA
0 HRAS G12S 3.0
2 BRAF E474A 3.0
7 HRAS G12S 17.7
More dynamic solution with list of tuples for values for filter in list comprehension
:
tup = [('BRAF','E474A'), ('HRAS', 'G12S')]
df = df[np.logical_or.reduce([(df['gene name']== a)&(df['annotation']== b) for a, b in tup])]
print (df)
gene name annotation ng DNA
0 HRAS G12S 3.0
2 BRAF E474A 3.0
7 HRAS G12S 17.7
Upvotes: 1