How can I replace substrings without replacing all at the same time? Python

Question

I have written a really good program that uses text files as word banks for generating sentences from sentence skeletons. An example:

The skeleton
"The noun is good at verbing nouns"
can be made into a sentence by searching a word bank of nouns and verbs to replace "noun" and "verb" in the skeleton. I would like to get a result like
"The dog is good at fetching sticks"

Unfortunately, the handy replace() method takes a fixed replacement string, not a custom function. I wrote functions that select random words from the right banks, but doing something like skeleton = skeleton.replace('noun', getNoun('file.txt')) calls getNoun() once and replaces ALL instances of 'noun' with that single result, instead of calling it for each replacement. So the sentences look like

"The dog is good at fetching dogs"

How might I work around this behavior of replace() and have my function called for each replacement? My minimal example is below.

import random

def getRandomLine(rsv):
    #parameter must be a return-separated value text file whose first line contains the number of lines in the file.
    f = open(rsv, 'r') #file handle on read mode
    n = int(f.readline()) #number of lines in file
    n = random.randint(1, n) #line number chosen to use
    s = "" #string to hold data
    for x in range (1, n):
        s = f.readline()
    s = s.replace("\n", "")
    return s

def makeSentence(rsv):
    #parameter must be a return-separated value text file whose first line contains the number of lines in the file.
    pattern = getRandomLine(rsv) #get a random pattern from file
    #replace word tags with random words from matching files
    pattern = pattern.replace('noun', getRandomLine('noun.txt'))
    pattern = pattern.replace('verb', getRandomLine('verb.txt'))

    return str(pattern)

def main():
    result = makeSentence('pattern.txt')
    print(result)

main()

user2357112 · Accepted Answer

The re module's re.sub function does the job str.replace does, but is far more flexible. In particular, the replacement can be a function rather than a string. The function is called once for each match, with a match object as its argument, and must return the string that will replace the match:

import re
pattern = re.sub('noun', lambda match: getRandomLine('noun.txt'), pattern)
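
To see the once-per-match behavior without any files, here is a small sketch. The in-memory word list and the next_noun helper are stand-ins I made up for illustration; they are not part of the question's code, and the words are handed out in a fixed order so the result is predictable.

```python
import re
from itertools import cycle

# Hypothetical in-memory word bank standing in for noun.txt.
nouns = cycle(["dog", "stick"])

def next_noun(match):
    # re.sub calls this once per match; the match object is not needed here.
    return next(nouns)

skeleton = "The noun is good at fetching nouns"
sentence = re.sub("noun", next_noun, skeleton)
print(sentence)  # The dog is good at fetching sticks
```

Each occurrence of 'noun' gets its own call, so the two replacements differ. Swapping next_noun for a function that calls getRandomLine('noun.txt') should give the random behavior the question asks for.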

The benefit here is added flexibility. The downside is that re.sub interprets its first argument ('noun' here) as a regex, which may cause surprises if you don't know regexes. For example, each '.' in a regex matches any character, so 'man...' matches 'manate':

>>> re.sub('Aw, man...', 'Match found.', 'Aw, manatee.')
'Match found.e.'

If you don't know regexes, you may want to use re.escape to create a regex that will match the raw text you're searching for even if the text contains regex metacharacters:

>>> re.sub(re.escape('Aw, man...'), 'Match found.', 'Aw, manatee.')
'Aw, manatee.'
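
Putting the two ideas together, one possible way to fill a whole skeleton is a single re.sub pass over all tags at once. This is only a sketch under assumptions: the WORD_BANKS dictionary and the fill_skeleton function are hypothetical names, and the banks are hard-coded lists rather than the text files used in the question.

```python
import random
import re

# Hypothetical in-memory word banks; the question reads these from files.
WORD_BANKS = {
    "noun": ["dog", "cat", "stick"],
    "verb": ["fetch", "carry", "throw"],
}

# Escape each tag so it is matched literally, then join them into one alternation.
TAG_PATTERN = re.compile("|".join(re.escape(tag) for tag in WORD_BANKS))

def fill_skeleton(skeleton):
    # The lambda runs once per tag occurrence and picks a fresh random word
    # from the bank named by the matched tag.
    return TAG_PATTERN.sub(lambda m: random.choice(WORD_BANKS[m.group(0)]), skeleton)

print(fill_skeleton("The noun is good at verbing nouns"))
```

Doing all tags in one pass also means inserted words are never rescanned, so a noun that happens to contain the text 'verb' would not be altered by a later replacement.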
