Gagan Deep

Reputation: 3

Insert string at beginning of regex match in Python

In the code below, the string "Graph" replaces the text matched by the regex:

htmlText = re.sub("[0-9]*/index.html", 'Graph', htmlText, flags=re.MULTILINE|re.DOTALL)

However, I want to prepend 'Graph' to the text matched by '[0-9]*/index.html', not replace it.

Upvotes: 0

Views: 1385

Answers (1)

DreadPirateShawn

Reputation: 8402

You want to capture the match (by wrapping your regex in parentheses), then backreference it (via \1 or the equivalent \g<1>) in a raw string (an r prefix on the replacement string), so that the backslash is not treated as an escape character:

In [1]: import re

In [2]: htmlText = "5/index.html"

In [3]: re.sub("([0-9]*/index.html)", r'Graph\g<1>', htmlText, flags=re.MULTILINE|re.DOTALL)
Out[3]: 'Graph5/index.html'

Edit: Changed r'Graph\1' to r'Graph\g<1>' above, since that is more reliable if someone uses this answer in a context where the backreference is followed by another digit. See the docs at https://docs.python.org/2/library/re.html#re.sub, which state:

\g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0
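To make the ambiguity concrete, here is a minimal sketch, assuming a variant of the pattern that captures only the digits (that variant and the variable names are mine, not from the question). It appends a literal 0 after group 1, first with \g<1> and then with \1:

```python
import re

text = "5/index.html"
# Hypothetical variant of the pattern: group 1 holds only the digits.
pattern = r"([0-9]+)/index\.html"

# \g<1>0 means "group 1, then a literal 0".
unambiguous = re.sub(pattern, r"\g<1>0/index.html", text)
print(unambiguous)  # 50/index.html

# \10 is read as a reference to group 10, which this pattern does not have,
# so re raises an error instead of inserting group 1 followed by "0".
try:
    re.sub(pattern, r"\10/index.html", text)
    raised = False
except re.error as exc:
    raised = True
    print("re.error:", exc)
```

So \1 is fine on its own, but \g<1> is the safer habit whenever a digit might follow the backreference.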

Note: The example above uses Python 2.7.6.
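For what it's worth, a rough sketch of the same idea as a standalone script, assuming Python 3 and some made-up sample markup (html_text and its contents are only for illustration). Two things differ from the session above: \g<0> refers to the whole match, so no capturing group is needed, and the flags are passed by keyword, because the fourth positional parameter of re.sub is count, not flags. I also tightened the pattern to require at least one digit and a literal dot, which may or may not be what you want:

```python
import re

# Made-up sample input with two links to rewrite.
html_text = '<a href="5/index.html">five</a>\n<a href="12/index.html">twelve</a>'

result = re.sub(
    r"[0-9]+/index\.html",           # at least one digit, then a literal "/index.html"
    r"Graph\g<0>",                   # \g<0> is the entire matched text
    html_text,
    flags=re.MULTILINE | re.DOTALL,  # flags by keyword; the 4th positional slot is count
)
print(result)
```

The flags from the question are kept only to show where they go; this particular pattern has no ^, $ or unescaped dot, so they do not change the result.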

Upvotes: 2
