Reputation:
I have a list of word phrases and a string as follows.
mylist = ['and rock', 'shake well', 'the']
mystring = "the sand rock need to be mixed and shake well"
I want to replace the phrases in mylist with "".
I am currently using the replace method in Python as follows.
for item in mylist:
    mystring = mystring.replace(item, "")
However, I noticed that this does not work well for all my sentences. For example, in mystring it produces a false match inside "sand rock" and outputs the following.
s need to be mixed and
However, I want the output to be:
sand rock need to be mixed and
Is there a better way of doing this in Python?
Upvotes: 1
Views: 404
Reputation: 3547
Use re.sub with \b (word boundary) to match whole words only:
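import re
# Raw strings keep \b as a word boundary; the group applies it to every phrase.
pattern = r'\b(?:' + '|'.join(map(re.escape, mylist)) + r')\b'
re.sub(pattern, '', mystring)
# ' sand rock need to be mixed and '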
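One detail worth checking when writing \b in Python: in a normal string literal, '\b' is the backspace character, not a word boundary, so the pattern should be a raw string. A small sketch of the difference (the sample text here is just an illustration):

```python
import re

# In a normal string literal, '\b' is the backspace character (0x08).
assert '\b' == '\x08'
# In a raw string, r'\b' is a backslash followed by 'b',
# which the re module interprets as a word boundary.
assert r'\b' == '\\b'

text = "sand rock"
# Looks for a literal backspace before "and", so nothing matches.
no_raw = re.search('\band', text)
# "and" inside "sand" does not start at a word boundary, so nothing matches.
inside_word = re.search(r'\band\b', text)
# "rock" is a whole word, so this matches.
whole_word = re.search(r'\brock\b', text)

print(no_raw, inside_word, whole_word.group())
```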
Upvotes: 0
Reputation: 4855
The problem is that str.replace() doesn't allow you to specify that you only want to match whole words (or phrases). The re module allows you to use regular expressions (regex) for pattern matching. With regex, you can specify word boundaries using the \b escape. Place the \b escape before and after your phrases to cause the match to only occur at word boundaries. The re.sub() function works like the str.replace() method, and you can use it in your code like this:
import re
mylist = ['and rock', 'shake well', 'the']
mystring = "the sand rock need to be mixed and shake well"
for item in mylist:
    mystring = re.sub(r"\b{}\b".format(item), "", mystring)
print(repr(mystring))
# ' sand rock need to be mixed and '
Upvotes: 3
Reputation: 5682
Part of the trick of your problem is that you don't want to match partial words. That's why the replace() method does not do what you want it to do. You can achieve what you want through regular expressions. One of the nice things about REs is that you can match on word boundaries using the \b anchor.
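As a rough sketch of that idea, assuming every phrase starts and ends with a word character (otherwise \b will not behave as expected), something like the following could work. The helper name remove_phrases is made up for this example. It also collapses the leftover spaces, which is an extra step to get the exact output the question asks for.

```python
import re

def remove_phrases(text, phrases):
    """Remove whole-word occurrences of each phrase from text (hypothetical helper)."""
    # Try longer phrases first so a long phrase wins over a shorter one that is its prefix.
    ordered = sorted(phrases, key=len, reverse=True)
    # re.escape guards against phrases that contain regex metacharacters.
    alternatives = "|".join(re.escape(p) for p in ordered)
    # \b on both sides of the group restricts every alternative to whole words.
    pattern = re.compile(r"\b(?:" + alternatives + r")\b")
    stripped = pattern.sub("", text)
    # Collapse the whitespace runs left behind by the removals.
    return " ".join(stripped.split())

mylist = ['and rock', 'shake well', 'the']
mystring = "the sand rock need to be mixed and shake well"
print(remove_phrases(mystring, mylist))
# sand rock need to be mixed and
```

Building one compiled pattern means the string is scanned once instead of once per phrase, which may matter if mylist is long.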
Upvotes: 2