Scott H

Reputation: 33

Replace word between two substrings (keeping other words)

I'm trying to replace a word (e.g. on) when it falls between two substrings (e.g. <temp> and </temp>), while keeping the other words between them.

string = "<temp>The sale happened on February 22nd</temp>"

The desired string after the replace would be:

Result = <temp>The sale happened {replace} February 22nd</temp>

I've tried using regex, but I've only been able to figure out how to replace everything between the two <temp> tags (because of the .*?):

import re
result = re.sub(r'<temp>.*?</temp>', '{replace}', string, flags=re.DOTALL)

However, on may also appear later in the string, outside <temp></temp>, and I wouldn't want to replace it there.

Upvotes: 3

Views: 201

Answers (2)

gaganso

Reputation: 3011

re.sub(r'(<temp>.*?) on (.*?</temp>)', lambda x: x.group(1)+" {replace} "+x.group(2), string, flags=re.DOTALL)

Output:

<temp>The sale happened {replace} February 22nd</temp>

Edit:

Changed the regex based on suggestions by Wiktor and HolyDanna.

P.S: Wiktor's comment on the question provides a better solution.
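
Note that the pattern above replaces at most one " on " per match, and because .*? can run past a closing tag, a <temp> block with no " on " in it may match through to a later one. A minimal sketch of one way around both issues follows: match each <temp>...</temp> block first, then do a plain string replace inside the matched block only. The helper name replace_inside_temp and its parameters are made up for illustration, and this is not necessarily what the comment mentioned above proposed.

```python
import re

def replace_inside_temp(text, old=" on ", new=" {replace} "):
    # Hypothetical helper: find each <temp>...</temp> block (non-greedy,
    # DOTALL so blocks may span lines), then run str.replace on that
    # block only. Text outside the blocks is never touched.
    pattern = re.compile(r"<temp>.*?</temp>", flags=re.DOTALL)
    return pattern.sub(lambda m: m.group(0).replace(old, new), text)

string = "<temp>The sale happened on February 22nd</temp> and shipped on Monday"
print(replace_inside_temp(string))
# <temp>The sale happened {replace} February 22nd</temp> and shipped on Monday
```

This assumes the <temp> tags are not nested and that the word is delimited by single spaces; a pattern such as \bon\b with re.sub inside the callback would be needed to catch other boundaries.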

Upvotes: 1

user5547025

Try lxml:

from lxml import etree

root = etree.fromstring("<temp>The sale happened on February 22nd</temp>")
root.text = root.text.replace(" on ", " {replace} ")
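# encoding="unicode" returns a str; without it, Python 3 prints a bytes repr (b'...')
print(etree.tostring(root, encoding="unicode", pretty_print=True))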

Output:

<temp>The sale happened {replace} February 22nd</temp>
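
The lxml snippet above assumes the whole string is a single <temp> element. The question mentions that "on" may also appear after the closing tag, in which case the string is not a well-formed document on its own. A rough sketch of one way to handle that, using the standard library's xml.etree.ElementTree instead of lxml: wrap the fragment in a temporary root element, edit only the text of each <temp> element, and strip the wrapper again. The wrapper name root and the helper replace_in_temp are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

def replace_in_temp(fragment, old=" on ", new=" {replace} "):
    # Hypothetical helper. Wrap the fragment so that text following
    # </temp> becomes the element's tail instead of a parse error.
    root = ET.fromstring("<root>" + fragment + "</root>")
    for elem in root.iter("temp"):
        # elem.text is the text directly inside <temp>; the tail
        # (text after </temp>) is left unchanged.
        if elem.text:
            elem.text = elem.text.replace(old, new)
    serialized = ET.tostring(root, encoding="unicode")
    # Drop the temporary wrapper (assumes the fragment is non-empty).
    return serialized[len("<root>"):-len("</root>")]

string = "<temp>The sale happened on February 22nd</temp> and shipped on Monday"
print(replace_in_temp(string))
# <temp>The sale happened {replace} February 22nd</temp> and shipped on Monday
```

This only works when the input is otherwise well-formed XML, and special characters such as & are re-escaped on output.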

Upvotes: 0
