Reputation: 176
I'm still learning and have previously done this with nested loops, but I was wondering whether there is a pretty, condensed way of filtering one list of strings against another list of strings. I have a pandas DataFrame with 300 columns and want to drop the columns whose names contain certain keywords. The plan is then to use the remaining column titles to make a new DataFrame.
Here are my attempts at list comprehension:
filter_array = ['hi', 'friend']
col_names = ['nice', 'to', 'meet', 'you' + 'friend']
p = [i for i in col_names if i not in filter_array]
print(p)
p = [i for i in col_names if e for e in filter_array e not in i]
print(p)
p = [i for i in col_names if e not in i for e in filter_array]
print(p)
The first attempt runs but doesn't remove 'youfriend': the filter word is present in the column name but is not exactly equal to it, so the name is kept. My last attempt fails with an error saying that e is referenced before assignment.
Also, why isn't there a tag for pythonic? :)
Thanks guys and gals
Upvotes: 4
Views: 99
Reputation: 48018
I think this gets you the result you're looking for:
>>> filter_array = ['hi', 'friend']
>>> col_names = ['nice', 'to', 'meet', 'you' + 'friend']
>>>
>>> [c for c in col_names if all([f not in c for f in filter_array])]
['nice', 'to', 'meet']
It's worth noting (from the comments) that you can drop the inner [] in the call to all() to change that inner list comprehension into a generator expression. The list comprehension will use more memory, but it can outperform a generator expression when every step of the generator must be consumed (that is, when all() can't short-circuit). You can also reverse the logic by using any() instead of all(). Ex:
>>> [c for c in col_names if all(f not in c for f in filter_array)]
['nice', 'to', 'meet']
>>> [c for c in col_names if not any(f in c for f in filter_array)]
['nice', 'to', 'meet']
Upvotes: 5
Reputation: 632
Potentially a more efficient way of doing this is to turn your lists of tags into sets with set(). Then you can do something like filteredSet = setA - setB, which will result in a copy of setA with the elements in setB removed from it.
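A minimal sketch of the set difference described here, using the question's names. One caveat: set difference compares whole elements, so it only removes names that are exactly equal to a filter word and does not catch substrings such as 'youfriend'. The extra 'hi' entry is added purely for illustration so that there is something to remove.

```python
filter_set = {'hi', 'friend'}
# 'hi' is an illustrative extra entry; the question's list has no exact match.
col_set = {'nice', 'to', 'meet', 'youfriend', 'hi'}

# Set difference removes elements equal to a filter word, not substring matches.
remaining = col_set - filter_set

# Sets are unordered, so sort before printing to get a stable result.
print(sorted(remaining))  # ['meet', 'nice', 'to', 'youfriend']
```

Because sets do not preserve order, the original column order is lost; if order matters, a list comprehension that tests membership against filter_set keeps it.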
Upvotes: 0