shantanuo

Reputation: 32296

Replace strings with placeholders in Python

I am trying to replace every word in a sentence with a placeholder such as xxx. The words "hello" and "morning" should be printed as-is because they appear in a separate list (mylist). The following code runs, but it prints extra placeholders.

import re

mylist = ['hello', 'morning']
nm = [
    "Hello World Robot Morning.",
    "Hello Aniket Fine Morning.",
    "Hello Paresh Good and bad Morning.",
]



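def punctuations(string):
    # Return the leading word of the token (two or more word characters),
    # which drops trailing punctuation such as the "." in "morning.".
    pattern = re.compile(r"(?u)\b\w\w+\b")
    result = pattern.match(string)
    # match() returns None for tokens without a leading word of that length,
    # e.g. a one-letter word, so fall back to an empty string.
    return result.group() if result else ""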


for x in nm:
    newlist = list()
    for y in x.split():
        for z in mylist:
            if z.lower() == punctuations(y.lower()):
                newlist.append(y)
            else:
                newlist.append("xxx")
    print(newlist)

Output:

['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']

Expected output:

['Hello', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'Morning.']
['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']

Upvotes: 1

Views: 293

Answers (2)

rivamarco

Reputation: 767

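Your inner loop appends something for every element of mylist, so each word produces one entry per list element. You have to break out of the inner loop as soon as the word is found, and append the placeholder only after every element of mylist has been checked without a match. The else clause on a for loop does exactly that, because it runs only when the loop finishes without hitting break: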

for x in nm:
    newlist = list()
    for y in x.split():
        for z in mylist:
            if z.lower() == punctuations(y.lower()):
                newlist.append(y)
                break
        else:
            newlist.append('xxx')
    print(newlist)
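
If you do not need the inner loop at all, the same check can be written as a set membership test. This is only a rough sketch of that alternative, not the code from the question: the names mask_words, allowed, placeholder and WORD are made up for illustration, and it uses a simpler \w+ pattern instead of the punctuations helper.

```python
import re

# Hypothetical helper names for illustration; they do not appear in the question.
WORD = re.compile(r"\w+")


def mask_words(sentence, allowed, placeholder="xxx"):
    """Keep tokens whose word part is in `allowed`; replace the rest."""
    allowed = {w.lower() for w in allowed}  # set gives a direct lookup, no inner loop
    result = []
    for token in sentence.split():
        match = WORD.search(token)  # strip punctuation such as a trailing "."
        word = match.group().lower() if match else ""
        result.append(token if word in allowed else placeholder)
    return result


print(mask_words("Hello Paresh Good and bad Morning.", ["hello", "morning"]))
# ['Hello', 'xxx', 'xxx', 'xxx', 'xxx', 'Morning.']
```

Each token is appended exactly once, so the extra placeholders cannot occur.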

Upvotes: 2

JosefAssad

Reputation: 4118

You're reaching for Python's vanilla string functions and regular expressions when your problem is better solved with formal parsing using a Parsing Expression Grammar (PEG).

For example:

import pyparsing as pp

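# Keyword matches the whole word exactly. Word("hello") would instead match
# any run of the characters h, e, l and o, so "hole" would also pass through.
expr = pp.OneOrMore(
    pp.Keyword("hello")
    | pp.Keyword("world")
    | pp.Word(pp.alphas).setParseAction(pp.replaceWith("XXX"))
)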

expr.parseString("hello foo bar world")

Yields:

(['hello', 'XXX', 'XXX', 'world'], {})

See the pyparsing module and its documentation.

Upvotes: 3
