Python: process multiple strings at the same time

Question

I have a list of strings and I want to remove the stop words from each string. The problem is that the stop word list is much longer than the strings, and I don't want to repeat the comparison against the stop word list for every string. Is there a way in Python to process these multiple strings at the same time?

lis = ['aka', 'this is a good day', 'a pretty dog']
stopwords = []  # pretty long list of words
result = []
for phrase in lis:
    words = phrase.split(' ')  # get list of words
    kept = []
    for word in words:
        if word not in stopwords:  # linear scan of the stop word list
            kept.append(word)
    result.append(' '.join(kept))

This is my current method, but it means I have to go through every phrase in the list and compare it against the stop words. Is there a way to process these phrases with only a single comparison pass?

Thanks.

Cory Kramer · Accepted Answer

This is the same idea, but with a few improvements. Convert your list of stopwords to a set for faster lookups: a membership test on a set takes constant time on average, whereas on a list it scans the elements one by one. Then iterate over your phrase list in a list comprehension: split each phrase into words, keep the words that are not in the stop set, and join the remaining words back together.

>>> lis = ['aka', 'this is a good day', 'a pretty dog']
>>> stopwords = ['a', 'dog']
>>> stop = set(stopwords)
>>> [' '.join(j for j in i.split(' ') if j not in stop) for i in lis]
['aka', 'this is good day', 'pretty']
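
If you need this in more than one place, the same approach can be wrapped in a small helper. This is only a sketch: the name `remove_stopwords` and its parameters are made up for illustration, and it uses `split()` with no argument (rather than `split(' ')` as above) so that runs of repeated spaces don't leave empty strings behind.

```python
def remove_stopwords(phrases, stopwords):
    """Return a new list of phrases with every stop word removed."""
    # Build the set once, up front; each lookup is then O(1) on average.
    stop = set(stopwords)
    cleaned = []
    for phrase in phrases:
        # split() with no argument splits on any run of whitespace.
        kept = [word for word in phrase.split() if word not in stop]
        cleaned.append(' '.join(kept))
    return cleaned


lis = ['aka', 'this is a good day', 'a pretty dog']
result = remove_stopwords(lis, ['a', 'dog'])
print(result)  # ['aka', 'this is good day', 'pretty']
```

Every phrase still has to be visited once, since each one must be split into words. What goes away is the repeated scan of the long stop word list for every word.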
