Reputation: 25547
I used BeautifulSoup to handle XML files that I have collected through a REST API.
The responses contain HTML code, but BeautifulSoup escapes all the HTML tags so it can be displayed nicely.
Unfortunately I need the HTML code.
How would I go on about transforming the escaped HTML into proper markup?
Help would be very much appreciated!
Upvotes: 8
Views: 5537
Reputation: 73155
You could try the urllib module?
It has a method unquote()
that might suit your needs.
Edit: on second thought, (and more reading of your question) you might just want to just use string.replace()
Like so:
string.replace('<','<')
string.replace('>','>')
Upvotes: 2
Reputation: 881537
I think you want xml.sax.saxutils.unescape from the Python standard library.
E.g.:
>>> from xml.sax import saxutils as su
>>> s = '<foo>bar</foo>'
>>> su.unescape(s)
'<foo>bar</foo>'
Upvotes: 20