user1452759

Reputation: 9450

Python how to strip a string from a string based on items in a list

I have a list as shown below:

exclude = ["please", "hi", "team"]

I have a string as follows:

text = "Hi team, please help me out."

I want my string to look as:

text = ", help me out."

In other words, I want to strip out any word that appears in the list exclude.

I tried the below:

if any(e in text.lower() for e in exclude):
    print text.lower().strip(e)

But the above if statement returns a boolean value and hence I get the below error:

NameError: name 'e' is not defined

How do I get this done?

Upvotes: 1

Views: 640

Answers (5)

Hackaholic

Reputation: 19753

If you are not worried about punctuation:

>>> import re
>>> text = "Hi team, please help me out."
>>> text = re.findall(r"\w+", text)
>>> text
['Hi', 'team', 'please', 'help', 'me', 'out']
>>> " ".join(x for x in text if x.lower() not in exclude)
'help me out'

In the above code, re.findall finds all the words and puts them in a list.
\w matches a letter, a digit, or an underscore (A-Za-z0-9_)
+ means one or more occurrences
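To see what the pattern above actually captures, here is a small sketch. The sample string is made up for illustration and is not from the question; it shows that underscores count as word characters while apostrophes and other punctuation split words apart.

```python
import re

# Hypothetical sample input, chosen to show how \w+ tokenizes
sample = "snake_case, it's 42!"

# \w+ grabs each run of letters, digits, and underscores
tokens = re.findall(r"\w+", sample)
print(tokens)  # ['snake_case', 'it', 's', '42']
```

Note that "it's" comes out as two tokens, so this approach may not suit text with contractions.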

Upvotes: 0

The6thSense

Reputation: 8335

Using simple methods:

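import re

exclude = ["please", "hi", "team"]
text = "Hi team, please help me out."
kept = []

# normalise the exclusion list once for case-insensitive comparison
excluded = [name.upper() for name in exclude]
# \w+ matches whole words only, so there are no empty strings to skip
for word in re.findall(r"\w+", text):
    if word.upper() not in excluded:
        kept.append(word)
print(" ".join(kept))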

Hope it helps

Upvotes: 0

Lord Henry Wotton

Reputation: 1362

This replaces every non-word character and every stopword with a space, then splits the result into the words you want to keep. Finally, the words are joined back into a single space-separated string. Note: the match is case sensitive, and stopwords are matched as substrings, so 'hi' would also be removed from inside a word such as 'this'.

' '.join(re.sub(r'\W|' + '|'.join(stopwords), ' ', sentence).split())

Example usage:

>>> import re
>>> stopwords=['please','hi','team']
>>> sentence='hi team, please help me out.'
>>> ' '.join(re.sub(r'\W|' + '|'.join(stopwords), ' ', sentence).split())
'help me out'
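If you want whole-word, case-insensitive matching that also keeps punctuation, a variation on the regex idea could look like the sketch below. The function name remove_words is hypothetical, and the use of \b word boundaries, re.escape, and re.IGNORECASE is an addition to the answer above, not part of it.

```python
import re

def remove_words(text, words):
    # One alternation of the escaped words, anchored at word boundaries
    # so that 'hi' does not match inside 'this' or 'high'
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    stripped = re.sub(pattern, "", text, flags=re.IGNORECASE)
    # Collapse the whitespace runs left behind by the removed words
    return re.sub(r"\s+", " ", stripped).strip()

print(remove_words("Hi team, please help me out.", ["please", "hi", "team"]))
# prints: , help me out.
```

With the text from the question this gives the ", help me out." that was asked for, and a string like "this is high" passes through unchanged.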

Upvotes: 0

Ashwini Chaudhary

Reputation: 251106

Something like this?

>>> from string import punctuation
>>> ' '.join(x for x in (word.strip(punctuation) for word in text.split())
                                                   if x.lower() not in exclude)
'help me out'

If you want to keep the trailing/leading punctuation with the words that are not present in exclude:

>>> ' '.join(word for word in text.split()
                             if word.strip(punctuation).lower() not in exclude)
'help me out.'

First one is equivalent to:

>>> out = []
>>> for word in text.split():
        word = word.strip(punctuation)
        if word.lower() not in exclude:
            out.append(word)
>>> ' '.join(out)
'help me out'

Upvotes: 3

Ashwani

Reputation: 2052

You can use this (remember, it is case sensitive):

for word in exclude:
    text = text.replace(word, "")
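As a rough illustration of what the case sensitivity means for the text in the question, the sketch below runs the same loop twice, once on the original string and once on a lowercased copy. The names result and lowered are made up for this example.

```python
exclude = ["please", "hi", "team"]
text = "Hi team, please help me out."

# Case-sensitive pass: "Hi" survives because only lowercase "hi" is replaced
result = text
for word in exclude:
    result = result.replace(word, "")
print(repr(result))   # 'Hi ,  help me out.'

# Lowercasing first removes all three words, but also changes the kept text's case
lowered = text.lower()
for word in exclude:
    lowered = lowered.replace(word, "")
print(repr(lowered))  # ' ,  help me out.'
```

Keep in mind that str.replace works on substrings, so "this".replace("hi", "") gives "ts"; it is only safe when the excluded words cannot occur inside other words.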

Upvotes: 1
