S9oXavyF

Reputation: 105

Don't encode Element text object using Python ElementTree

I'm trying to put HTML data inside the text node of an element, but it gets escaped as if it were plain text rather than markup.

Here is an MWE:

from xml.etree import ElementTree as ET

data = '<a href="https://example.com">Example data gained from elsewhere.</a>'

p = ET.Element('p')
p.text = data
p = ET.tostring(p, encoding='utf-8', method='html').decode('utf8')
print(p)

The output is...

<p>&lt;a href="https://example.com"&gt;Example data gained from elsewhere.&lt;/a&gt;</p>

What I intended is...

<p><a href="https://example.com">Example data gained from elsewhere.</a></p>

Upvotes: 1

Views: 316

Answers (2)

Kris

Reputation: 8868

The problem is the assignment p.text = data: it makes the string the element's text content, so the markup characters are escaped when the tree is serialized. Parse the string into an element and append it as a child instead:

from xml.etree import ElementTree as ET

data = '<a href="https://example.com">Example data gained from elsewhere.</a>'

d = ET.fromstring(data)
p = ET.Element('p')

p.append(d)
p = ET.tostring(p, encoding='utf-8', method='html').decode('utf8')
print(p)

This gives the output:

<p><a href="https://example.com">Example data gained from elsewhere.</a></p>
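ET.fromstring expects a single root element, so a fragment with leading text or several sibling elements will not parse as it stands. A minimal sketch of one way around that, assuming the fragment is still well-formed XML once wrapped: parse it inside a throwaway root and move the text and children across. The helper name set_inner_markup and the wrapper tag are hypothetical, chosen only for illustration.

```python
from xml.etree import ElementTree as ET


def set_inner_markup(parent, fragment):
    """Parse `fragment` as XML and use it as the content of `parent`.

    Hypothetical helper; assumes `parent` has no children yet and that
    `fragment` is well-formed once wrapped in a single root element.
    """
    # A throwaway root lets leading text and sibling elements parse.
    wrapper = ET.fromstring('<wrapper>' + fragment + '</wrapper>')
    # Text before the first child element lives on the wrapper itself.
    parent.text = wrapper.text
    # Move each parsed child across; its tail text travels with it.
    for child in list(wrapper):
        parent.append(child)
    return parent


p = ET.Element('p')
set_inner_markup(p, 'See <a href="https://example.com">this link</a> and <em>that</em>.')
result = ET.tostring(p, encoding='unicode', method='html')
print(result)
# <p>See <a href="https://example.com">this link</a> and <em>that</em>.</p>
```

Passing encoding='unicode' makes tostring return a str directly, so the separate .decode() step is not needed.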

Upvotes: 2

Richard Neumann

Reputation: 3361

You can parse the string into an Element with ET.fromstring (this works because the snippet is also well-formed XML) and append it to the parent element:

from xml.etree import ElementTree as ET

data = '<a href="https://example.com">Example data gained from elsewhere.</a>'

p = ET.Element('p')
p.append(ET.fromstring(data))
p = ET.tostring(p, encoding='utf-8', method='html').decode('utf8')
print(p)
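One caveat: ET.fromstring is an XML parser, so HTML that is not well-formed XML (an unclosed <br>, for example) raises ET.ParseError. A rough sketch of a guard, assuming that falling back to escaped text is acceptable for your use case; the helper name append_fragment is made up for this example.

```python
from xml.etree import ElementTree as ET


def append_fragment(parent, fragment):
    """Append `fragment` as a child element if it parses as XML.

    Hypothetical helper: on a parse failure it stores the raw string as
    text instead, which will be escaped on output. Returns True if parsed.
    """
    try:
        parent.append(ET.fromstring(fragment))
        return True
    except ET.ParseError:
        parent.text = fragment
        return False


ok = ET.Element('p')
parsed_ok = append_fragment(ok, '<a href="https://example.com">Example</a>')

bad = ET.Element('p')
# The unclosed <br> is valid HTML but not well-formed XML.
parsed_bad = append_fragment(bad, '<a href="https://example.com">Line one<br>line two</a>')

print(parsed_ok, ET.tostring(ok, encoding='unicode', method='html'))
print(parsed_bad, ET.tostring(bad, encoding='unicode', method='html'))
```

If the input is arbitrary real-world HTML, a dedicated HTML parser is likely a better fit than xml.etree.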

Upvotes: 1
