Reputation: 759
I want to escape the unescaped data inside an XML string, e.g.
string = '<tag parameter = "something">I want to escape these >, < and &</tag>'
to
'<tag parameter = "something">I want to escape these &gt;, &lt; and &amp;</tag>'
With a regex, I figured out a way to match the data substring and get its start and end positions:
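import re
from xml.sax import saxutils

def escape_data(label):
    # Match from the '>' that closes the opening tag up to the first '</'
    exp = re.search(">.+?</", label)
    # Get position of the data between tags
    start = exp.start() + 1
    end = exp.end() - 2
    return label[:start] + saxutils.escape(label[start:end]) + label[end:]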
But with re.search, I can't match the exact XML format.
Upvotes: 2
Views: 409
Reputation: 6466
Perhaps you should consider re.sub:
>>> import re, cgi
>>> oldString = '<tag parameter = "something">I want to escape these >, < and &</tag>'
>>> newString = re.sub(r"(<tag.*?>)(.*?)</tag>", lambda m: m.group(1) + cgi.escape(m.group(2)) + "</tag>", oldString)
>>> print newString
<tag parameter = "something">I want to escape these &gt;, &lt; and &amp;</tag>
Be warned that this regular expression will definitely break if you have nested tags. See "Why is it such a bad idea to parse XML with regex?"
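For Python 3, where cgi.escape is no longer available, the same re.sub idea might look roughly like the sketch below. It swaps in xml.sax.saxutils.escape (the function the question already uses) and matches any tag name rather than only "tag". The names TAG_RE and escape_element_text are made up for illustration, and the pattern still assumes flat, non-nested elements whose attribute values contain no ">".

```python
import re
from xml.sax.saxutils import escape

# Hypothetical pattern: opening tag, lazily matched text, matching closing tag.
# Assumes no nested elements and no '>' inside attribute values.
TAG_RE = re.compile(r"(<(\w+)[^>]*>)(.*?)(</\2>)", re.DOTALL)

def escape_element_text(xml_string):
    """Escape &, < and > in the text between each opening tag and its closing tag."""
    return TAG_RE.sub(
        lambda m: m.group(1) + escape(m.group(3)) + m.group(4),
        xml_string,
    )

old_string = '<tag parameter = "something">I want to escape these >, < and &</tag>'
print(escape_element_text(old_string))
```

Note that text which is already escaped would be escaped a second time (an existing &amp; becomes &amp;amp;), so this probably only suits input that is known to be entirely unescaped.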
Upvotes: 3