Reputation: 9450
I have a list as shown below:
exclude = ["please", "hi", "team"]
I have a string as follows:
text = "Hi team, please help me out."
I want my string to look as:
text = ", help me out."
effectively stripping out any word that might appear in the list exclude
I tried the below:
if any(e in text.lower() for e in exclude):
    print text.lower().strip(e)
But the above if statement only evaluates to a boolean value, and hence I get the below error:
NameError: name 'e' is not defined
How do I get this done?
Upvotes: 1
Views: 640
Reputation: 19753
If you are not worried about punctuation:
>>> import re
>>> exclude = ["please", "hi", "team"]
>>> text = "Hi team, please help me out."
>>> words = re.findall(r"\w+", text)
>>> words
['Hi', 'team', 'please', 'help', 'me', 'out']
>>> " ".join(x for x in words if x.lower() not in exclude)
'help me out'
In the above code, re.findall finds every word and puts them in a list. \w matches a single word character (letters, digits, and the underscore, i.e. [A-Za-z0-9_] for ASCII text), and + means one or more occurrences.
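To make the tokenizing step concrete, here is a small sketch of what \w+ picks up. The sample string is made up for illustration (it is not from the question) and includes an underscore and a digit to show that both count as word characters.

```python
import re

# Hypothetical sample input, chosen to include an underscore and a digit.
sample = "Hi team_1, please help me out."

# \w+ grabs maximal runs of word characters; spaces and punctuation act as separators.
tokens = re.findall(r"\w+", sample)
print(tokens)  # ['Hi', 'team_1', 'please', 'help', 'me', 'out']
```

Note that 'team_1' comes back as a single token, so it would not match an exclude entry of "team".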
Upvotes: 0
Reputation: 8335
Using simple methods:
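import re

exclude = ["please", "hi", "team"]
text = "Hi team, please help me out."

# Compare case-insensitively against an upper-cased copy of the exclude list.
excluded = set(name.upper() for name in exclude)
kept = []
# \w+ returns only non-empty words, so no empty matches need filtering.
for word in re.findall(r"\w+", text):
    if word.upper() not in excluded:
        kept.append(word)
print(" ".join(kept))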
Hope it helps
Upvotes: 0
Reputation: 1362
This replaces with a space every character that is not alphanumeric and every occurrence of a word from the stopwords list, then splits the result into the words you want to keep. Finally, the list is joined back into a single space-separated string. Note: it is case sensitive.
' '.join(re.sub(r'\W|' + '|'.join(stopwords), ' ', sentence).split())
Example usage:
>>> import re
>>> stopwords=['please','hi','team']
>>> sentence='hi team, please help me out.'
>>> ' '.join(re.sub(r'\W|' + '|'.join(stopwords), ' ', sentence).split())
'help me out'
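One caveat with the one-liner above: the stopwords are matched anywhere in the text, so 'hi' is also removed from inside longer words such as 'this'. A possible variant, sketched here under the assumption that whole-word, case-insensitive matching is what you want, wraps the alternatives in \b word boundaries. The sample sentence is made up to show the difference.

```python
import re

stopwords = ['please', 'hi', 'team']
# Hypothetical sentence containing "this" and "high", which both contain "hi".
sentence = 'Hi team, this is a high priority; please help me out.'

# \b anchors each alternative at word boundaries, so "hi" no longer matches inside "this".
# re.escape guards against stopwords that contain regex metacharacters.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, stopwords)) + r')\b',
                     re.IGNORECASE)

# Drop the stopwords, turn the remaining punctuation into spaces, then collapse whitespace.
cleaned = ' '.join(re.sub(r'\W', ' ', pattern.sub(' ', sentence)).split())
print(cleaned)  # this is a high priority help me out
```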
Upvotes: 0
Reputation: 251106
Something like this?
>>> from string import punctuation
>>> ' '.join(x for x in (word.strip(punctuation) for word in text.split())
...          if x.lower() not in exclude)
'help me out'
If you want to keep the leading/trailing punctuation on the words that are not present in exclude:
>>> ' '.join(word for word in text.split()
...          if word.strip(punctuation).lower() not in exclude)
'help me out.'
The first one is equivalent to:
>>> out = []
>>> for word in text.split():
...     word = word.strip(punctuation)
...     if word.lower() not in exclude:
...         out.append(word)
...
>>> ' '.join(out)
'help me out'
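If you need this in more than one place, both variants can be folded into one helper. This is only a sketch: the function name remove_words and the keep_punctuation flag are made up for illustration and are not part of any library.

```python
from string import punctuation

def remove_words(text, exclude, keep_punctuation=False):
    """Drop whitespace-separated words whose punctuation-stripped,
    lower-cased form appears in exclude (hypothetical helper)."""
    # A set gives fast membership tests; lower-casing makes the check case-insensitive.
    excluded = {w.lower() for w in exclude}
    kept = []
    for word in text.split():
        bare = word.strip(punctuation)
        if bare.lower() not in excluded:
            # Keep either the original token or the stripped one.
            kept.append(word if keep_punctuation else bare)
    return ' '.join(kept)

exclude = ["please", "hi", "team"]
text = "Hi team, please help me out."
print(remove_words(text, exclude))                         # help me out
print(remove_words(text, exclude, keep_punctuation=True))  # help me out.
```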
Upvotes: 3
Reputation: 2052
You can use this (remember, it is case sensitive):
for word in exclude:
    text = text.replace(word, "")
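With the text from the question, the case-sensitive replace leaves "Hi" in place, because only lowercase "hi" is in the list. A case-insensitive, whole-word variant can be sketched with re instead of str.replace. This swaps in a different technique from the one above, and the whitespace cleanup step is my own assumption about the desired output.

```python
import re

exclude = ["please", "hi", "team"]
text = "Hi team, please help me out."

# Match any excluded word as a whole word, ignoring case.
pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, exclude)) + r")\b",
                     re.IGNORECASE)

# Remove the words, then collapse the leftover runs of whitespace.
without_words = pattern.sub("", text)
result = re.sub(r"\s+", " ", without_words).strip()
print(result)  # , help me out.
```

This keeps the punctuation, which gives the exact string asked for in the question.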
Upvotes: 1