Reputation: 1262
I have two lists:
listA = ['New Delhi', 'Moscow', 'Berlin', 'France', 'To Washington']
stopwordlist = ['new', 'To']
I am trying to get something like this:
finalList = ['Moscow', 'Berlin', 'France']
What I have tried until now works if I am looking for whole words:
listB = []
for item in listA:
    if item not in stopwordlist:
        listB.append(item)
    else:
        continue
....
....
return listB
I could split each item and then check the resulting words against the stopwordlist, but this seems like too many workarounds. Or I could use a regex with re.match.
Upvotes: 2
Views: 323
Reputation: 327
listA = ['New Delhi', 'Moscow', 'Berlin', 'France', 'To Washington']
stopwordlist = ['new', 'To']
# Lowercase only the stop words so the original casing of listA is preserved
stopwordlist = [i.lower() for i in stopwordlist]
listB = []
for item in listA:
    flag = True
    for i in item.split(' '):
        # Compare case-insensitively, one word at a time
        if i.lower() in stopwordlist:
            flag = False
    if flag:
        listB.append(item)
print(listB)
Upvotes: 1
Reputation: 4189
sl = tuple(i.lower() for i in stopwordlist)
[i for i in listA if not i.lower().startswith(sl)]
Output
['Moscow', 'Berlin', 'France']
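One caveat with this approach: str.startswith matches prefixes, not whole words, so an item that merely begins with the letters of a stop word is dropped too. A minimal sketch of the difference, where 'Toronto' and 'Newcastle' are made-up test entries that are not in the question, and the first-word variant is just one possible alternative:

```python
listA = ['New Delhi', 'Moscow', 'Toronto', 'Newcastle', 'To Washington']
stopwordlist = ['new', 'To']

sl = tuple(i.lower() for i in stopwordlist)

# Prefix match: drops every item whose text starts with a stop word
prefix_result = [i for i in listA if not i.lower().startswith(sl)]

# First-word match: compare only the first whitespace-separated word
# (assumes every item contains at least one word)
first_word_result = [i for i in listA if i.lower().split()[0] not in sl]

print(prefix_result)      # ['Moscow']
print(first_word_result)  # ['Moscow', 'Toronto', 'Newcastle']
```

Which behaviour is wanted depends on whether the stop words are meant as whole words or as prefixes.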
Upvotes: 2
Reputation: 17322
You have to lowercase both your stop words and the words you check them against:
listA = ['New Delhi', 'Moscow', 'Berlin', 'France', 'To Washington']
stopwordlist = ['new', 'To']
stop_words = {e.lower() for e in stopwordlist}
finalList = [e for e in listA if not stop_words.intersection(e.lower().split())]
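Or you can use the third-party regex module, whose \L<words> named-list syntax is not available in the standard re module: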
import regex as re
stop_words_regex = re.compile(r"\L<words>", words=stop_words)
finalList = [e for e in listA if not stop_words_regex.findall(e.lower())]
Output:
['Moscow', 'Berlin', 'France']
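If installing the third-party regex module is not an option, roughly the same idea can be sketched with the standard re module by joining the escaped stop words into one alternation. The \b word boundaries are an addition here (the version above matches anywhere in the string), and the sketch assumes stopwordlist is not empty:

```python
import re

listA = ['New Delhi', 'Moscow', 'Berlin', 'France', 'To Washington']
stopwordlist = ['new', 'To']

# Escape each stop word, join them into one alternation, and anchor it
# on word boundaries; IGNORECASE removes the need to lowercase anything.
# Assumes stopwordlist is non-empty (an empty alternation matches everywhere).
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(w) for w in stopwordlist) + r")\b",
    re.IGNORECASE,
)

finalList = [e for e in listA if not pattern.search(e)]
print(finalList)  # ['Moscow', 'Berlin', 'France']
```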
Upvotes: 0
Reputation: 24562
Here is one way to do this:
>>> listA = ['New Delhi', 'Moscow', 'Berlin', 'France', 'To Washington']
>>> stopwordlist = ['new', 'To']
>>> finalList = [i for i in listA if not any(j.lower() in i.lower() for j in stopwordlist)]
>>> finalList
['Moscow', 'Berlin', 'France']
Or you could use the built-in filter function:
>>> listA = ['New Delhi', 'Moscow', 'Berlin', 'France', 'To Washington']
>>> stopwordlist = ['new', 'To']
>>> list(filter(lambda x: not any(j.lower() in x.lower() for j in stopwordlist), listA))
['Moscow', 'Berlin', 'France']
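Note that the `in` test on strings checks for substrings, so it also removes items that contain a stop word in the middle of another word. A small sketch of the difference, where 'Stockholm' is a made-up entry that is not in the question and the whole-word variant is only one possible alternative:

```python
listA = ['New Delhi', 'Moscow', 'Stockholm', 'To Washington']
stopwordlist = ['new', 'To']

stop_words = {w.lower() for w in stopwordlist}

# Substring test: 'to' occurs inside 'stockholm', so that item is dropped
substring_result = [i for i in listA
                    if not any(w in i.lower() for w in stop_words)]

# Whole-word test: split on whitespace and compare complete words only
word_result = [i for i in listA
               if stop_words.isdisjoint(i.lower().split())]

print(substring_result)  # ['Moscow']
print(word_result)       # ['Moscow', 'Stockholm']
```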
Upvotes: 2