Reputation: 235
I want to port this code from Python 2 to Python 3:
p = re.compile(ur"(?s)<u>(.*?)<\/u>")
subst = "\underline{\\1}"
raw_html = re.sub(p, subst, raw_html)
I have already figured out that the ur prefix should be changed to just r:
p = re.compile(r"(?s)<u>(.*?)<\/u>")
subst = "\underline{\\1}"
raw_html = re.sub(p, subst, raw_html)
However, it does not work. Python complains about this:
cd build && PYTHONWARNINGS="ignore" python3 ../src/katalog.py --katalog 1
Traceback (most recent call last):
  File "src/katalog.py", line 11, in <module>
    from common import *
  File "src/common.py", line 207
    subst = "\underline{\\1}"
            ^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape
make: *** [katalog1] Error 1
However, changing it to "\underline" does not help either: the text is then not replaced.
Upvotes: 2
Views: 86
Reputation: 18611
Use:
import re
raw_html = r"<u>1</u> and <u>2</u>"
p = re.compile(r"(?s)<u>(.*?)</u>")
subst = r"\\underline{\1}"
raw_html = re.sub(p, subst, raw_html)
print(raw_html)
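This prints \underline{1} and \underline{2}. The SyntaxError arises because in a Python 3 string literal that is not raw, \u starts a \uXXXX Unicode escape, whereas a Python 2 byte string left it as a literal backslash followed by u. Basically, inside the replacement string, use a double backslash to produce a single backslash in the output. Use raw string literals to make life easier with regex in Python.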
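As a side note, here is a small sketch of the two escaping layers involved. It compares the template replacement shown above with a callable replacement, whose return value re.sub inserts without processing backslash escapes. The names PATTERN, with_template and with_callable are made up for this illustration and are not from the original code.

```python
import re

# Same pattern as in the answer; (?s) lets . match newlines too.
PATTERN = re.compile(r"(?s)<u>(.*?)</u>")


def with_template(html: str) -> str:
    # Template string: re.sub interprets backslash escapes in it, so the
    # raw literal with two backslashes yields one backslash in the output.
    return PATTERN.sub(r"\\underline{\1}", html)


def with_callable(html: str) -> str:
    # Callable replacement: the returned string is inserted as-is,
    # so a single backslash in the raw literal is enough.
    return PATTERN.sub(lambda m: r"\underline{" + m.group(1) + "}", html)


sample = "<u>1</u> and <u>2</u>"
print(with_template(sample))
print(with_callable(sample))
```

Both calls should give the same text. The callable form may be handy when the replacement contains many backslashes, as LaTeX output usually does.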
Upvotes: 2