Reputation:
I need to build a multi-condition filter on 2 columns. The table has 7 columns, but only the first ('query') and the last ('template') are used for filtering.
I did this before and it worked, but now (1 year later) I can't figure out what's wrong.
import glob
import pandas as pd

for item in glob.glob('D:\\path\\*.change'):
    table = pd.read_csv(item, sep='\t', index_col=None)
    #FILTERING
    filtered_table = table[
        (table['query'].str.contains("egg*", regex=True)==False) &
        (table['query'].str.contains(".*phospho*", regex=True)==False) &
        (table['query'].str.contains("vipe", regex=True)==False) &
        (table['template'].str.contains("ABC1")) |
        (table['template'].str.contains("bender")) ]
The expected result is the table without the rows whose 'query' column matches egg*, .*phospho* or vipe, AND with only the rows whose 'template' column contains 'ABC1' or 'bender'.
Upvotes: 1
Views: 5373
Reputation:
My answer to the problem (filter on 'query' first, then filter the result on 'template'):
import glob
import pandas as pd

for item in glob.glob('D:\\path\\*.change'):
    table = pd.read_csv(item, sep='\t', index_col=None)
    #FILTERING
    # Step 1: drop rows whose 'query' matches any unwanted pattern
    query_table = table[
        (table['query'].str.contains("egg*", regex=True)==False) &
        (table['query'].str.contains(".*phospho*", regex=True)==False) &
        (table['query'].str.contains("vipe", regex=True)==False) ]
    # Step 2: from what is left, keep rows whose 'template' matches
    filtered_table = query_table[
        (query_table['template'].str.contains("ABC1")) |
        (query_table['template'].str.contains("bender")) ]
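The same two-step idea can probably be written more compactly by joining the patterns with the regex alternation `|` and negating with `~`. This is only a sketch: the sample rows and the names `exclude` and `include` are made up for illustration, and it assumes neither column contains missing values.

```python
import pandas as pd

# Hypothetical sample data; only 'query' and 'template' matter here.
table = pd.DataFrame({
    "query": ["egg_white", "phospho_1", "viper", "kinase", "actin"],
    "template": ["ABC1_x", "bender_y", "ABC1_z", "ABC1_k", "other"],
})

exclude = ["egg*", ".*phospho*", "vipe"]  # patterns taken from the question
include = ["ABC1", "bender"]

# Step 1: drop rows whose 'query' matches any of the excluded patterns.
query_mask = table["query"].str.contains("|".join(exclude), regex=True)
query_table = table[~query_mask]

# Step 2: keep rows whose 'template' matches any of the included patterns.
template_mask = query_table["template"].str.contains("|".join(include), regex=True)
filtered_table = query_table[template_mask]

print(filtered_table)
```

Note that as regular expressions, "egg*" matches "eg" followed by any number of "g", and ".*phospho*" matches anything containing "phosph", so both are a bit broader than they look.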
Upvotes: 3
Reputation: 2533
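I think the problem is the missing brackets in your condition. In Python `&` binds more tightly than `|`, so without brackets the final "bender" check is OR-ed with everything before it instead of being grouped with the "ABC1" check.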
Try this:
table[(
    # AND condition (each comparison needs its own brackets,
    # because & also binds more tightly than ==)
    (table['query'].str.contains("egg*", regex=True) == False) &
    (table['query'].str.contains(".*phospho*", regex=True) == False) &
    (table['query'].str.contains("vipe", regex=True) == False) &
    # OR condition, grouped so it is evaluated as one unit
    (table['template'].str.contains("ABC1") |
     table['template'].str.contains("bender"))
)]
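To see the effect of the brackets, here is a small sketch on made-up data (the rows and the variable names are assumptions, not from your file). It compares the ungrouped and grouped versions of the same condition:

```python
import pandas as pd

# Hypothetical three-row table for illustration only.
table = pd.DataFrame({
    "query": ["egg_white", "kinase", "actin"],
    "template": ["bender_a", "ABC1_b", "other"],
})

not_egg = ~table["query"].str.contains("egg*", regex=True)
is_abc1 = table["template"].str.contains("ABC1")
is_bender = table["template"].str.contains("bender")

# Without brackets this is parsed as (not_egg & is_abc1) | is_bender,
# so any 'bender' row is kept even if its query should be excluded.
ungrouped = table[not_egg & is_abc1 | is_bender]

# With brackets the OR is evaluated first, then combined with the AND.
grouped = table[not_egg & (is_abc1 | is_bender)]

print(ungrouped)
print(grouped)
```

In the ungrouped result the "egg_white" row slips through because its template contains "bender"; in the grouped result it is removed.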
Upvotes: 2