Reputation: 32296
I am trying to replace every word in a sentence with a placeholder such as xxx, except for the words "hello" and "morning", which should be printed as-is because they appear in another list. The following code runs, but prints extra placeholders.
import re

mylist = ['hello', 'morning']
nm = [
    "Hello World Robot Morning.",
    "Hello Aniket Fine Morning.",
    "Hello Paresh Good and bad Morning.",
]

def punctuations(string):
    pattern = re.compile(r"(?u)\b\w\w+\b")
    result = pattern.match(string)
    myword = result.group()
    return myword

for x in nm:
    newlist = list()
    for y in x.split():
        for z in mylist:
            if z.lower() == punctuations(y.lower()):
                newlist.append(y)
            else:
                newlist.append("xxx")
    print(newlist)
Output:
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
Expected output:
['Hello', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
Upvotes: 1
Views: 293
Reputation: 767
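You have to break out of the inner loop as soon as the word is found, and append the placeholder only after every element of mylist has been checked without a match.
Python's for ... else does exactly that: the else branch runs only when the loop finishes without hitting break.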
for x in nm:
    newlist = list()
    for y in x.split():
        for z in mylist:
            if z.lower() == punctuations(y.lower()):
                newlist.append(y)
                break
        else:
            # belongs to the inner for loop: runs only if no break occurred
            newlist.append('xxx')
    print(newlist)
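As a possible alternative to the nested loop above, the membership test can be done directly against a set, which removes the need for break and else altogether. This is only a sketch: the names mask_words, KEEP and WORD are made up for illustration, and it uses re.search with \w+ rather than the question's punctuations() helper, since that helper's pattern needs at least two word characters and result.group() would raise AttributeError on a one-letter token such as "a".

```python
import re

KEEP = {"hello", "morning"}   # words to leave untouched (hypothetical name)
WORD = re.compile(r"\w+")     # first run of word characters in a token


def mask_words(sentence, keep=KEEP, placeholder="xxx"):
    """Return the tokens of `sentence`, replacing any token not in `keep`."""
    masked = []
    for token in sentence.split():
        match = WORD.search(token)
        # Compare case-insensitively and ignore surrounding punctuation
        core = match.group().lower() if match else ""
        masked.append(token if core in keep else placeholder)
    return masked


sentences = [
    "Hello World Robot Morning.",
    "Hello Aniket Fine Morning.",
    "Hello Paresh Good and bad Morning.",
]
for sentence in sentences:
    print(mask_words(sentence))
```

For the three sentences from the question this should print the expected lists, with each original token (including its trailing punctuation) kept when it matches.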
Upvotes: 2
Reputation: 4118
You're reaching for Python's built-in string functions and regular expressions when your problem may be better solved with formal parsing using a Parsing Expression Grammar (PEG):
For example:
import pyparsing as pp
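# pp.Keyword matches the whole word; pp.Word("hello") would match any run of the letters h, e, l, o
expr = pp.OneOrMore(pp.Keyword("hello") | pp.Keyword("world") | pp.Word(pp.alphas).setParseAction(pp.replaceWith("XXX")))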
expr.parseString("hello foo bar world")
Yields:
(['hello', 'XXX', 'XXX', 'world'], {})
See the pyparsing module and its documentation.
Upvotes: 3