Reputation: 3
In the code below, the string "Graph" replaces the text matched by the regex:
htmlText = re.sub("[0-9]*/index.html", 'Graph', htmlText, re.MULTILINE|re.DOTALL)
The problem is that I want to prepend 'Graph' to the text matched by '[0-9]*/index.html', not replace it.
Upvotes: 0
Views: 1385
Reputation: 8402
You want to capture the match (by surrounding your regex with parentheses), then backreference it (via \1), using a raw string (an r prefix on the replacement string) so that the backslash is not treated as an escape character:
In [1]: import re
In [2]: htmlText = "5/index.html"
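In [3]: re.sub("([0-9]*/index.html)", r'Graph\g<1>', htmlText, flags=re.MULTILINE|re.DOTALL)
Out[3]: 'Graph5/index.html'
The flags are passed as flags=... here because the fourth positional argument of re.sub is count, not flags; passing them positionally would silently cap the number of replacements instead of setting the flags.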
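As a rough sketch of two alternatives that avoid the capture group altogether: \g<0> in the replacement refers to the whole match, and re.sub also accepts a callable that builds the replacement from the match object. The sample html_text, and the slightly stricter pattern (at least one digit, escaped dot), are assumptions made for illustration and are not from the question.

```python
import re

# Hypothetical sample input; the question does not show the real HTML.
html_text = '<a href="5/index.html">five</a> <a href="12/index.html">twelve</a>'

# Assumed tightening of the original pattern: require a digit, escape the dot.
pattern = r"[0-9]+/index\.html"

# \g<0> stands for the entire match, so no parentheses are needed.
with_backref = re.sub(pattern, r"Graph\g<0>", html_text)

# A callable receives the match object and returns the replacement text.
with_callable = re.sub(pattern, lambda m: "Graph" + m.group(0), html_text)

print(with_backref)
print(with_callable)
```

Both calls should give the same result; the callable form is mostly useful when the prefix depends on what was matched.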
Edit: Changed r'Graph\1' to r'Graph\g<1>' above, since that is more reliable if someone uses this answer in a context where the backreference is followed by another digit. See the docs at https://docs.python.org/2/library/re.html#re.sub, which state:
\g<2> is therefore equivalent to \2, but isn’t ambiguous in a replacement such as \g<2>0.
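To illustrate the ambiguity the docs describe, here is a small sketch in which the replacement has to put a literal 0 directly after group 1. The pattern and input are made up for the demonstration; on Python 3, the \10 form is expected to raise re.error because it is read as a reference to group 10.

```python
import re

text = "5/index.html"
# Illustrative pattern: group 1 captures only the leading digits.
pattern = r"([0-9]+)/index\.html"

# \g<1>0 means "group 1, then a literal 0".
unambiguous = re.sub(pattern, r"\g<1>0/index.html", text)
print(unambiguous)

# \10 is parsed as a reference to group 10, which this pattern lacks.
ambiguous_error = None
try:
    re.sub(pattern, r"\10/index.html", text)
except re.error as exc:
    ambiguous_error = str(exc)
print(ambiguous_error)
```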
Note: The IPython session above was run on Python 2.7.6.
Upvotes: 2