guettli
guettli

Reputation: 27969

xml.etree.ElementTree: How to replace like "innerHTML"?

I want to replace the <h1> tag of a html page.

But the content of the heading can be HTML (not just a string).

I want to insert foo <b>bold</b> bar

input:

start 
<h1 class="myclass">bar <i>italic</i></h1>
end

Desired output:

start 
<h1 class="myclass">foo <b>bold</b> bar</h1>
end

How to solve this with Python?

Upvotes: 0

Views: 259

Answers (2)

seagulf
seagulf

Reputation: 378

using htql:

page="""start 
<h1 class="myclass">bar <i>italic</i></h1>
end
"""
import htql
x = htql.query(page, "<h1>:tx &replace('foo <b>bold</b> bar') ")[0][0]

You get:

>>> x
'start \n<h1 class="myclass">foo <b>bold</b> bar</h1>\nend\n'

Upvotes: 0

guettli
guettli

Reputation: 27969

parser = HTMLParser(namespaceHTMLElements=False)
etree = parser.parse('start <h1 class="myclass">bar <i>italic</i></h1> end')
for h1 in etree.findall('.//h1'):
    for sub in h1:
        h1.remove(sub)
    html = parser.parse('foo <b>bold</b> bar')
    body = html.find('.//body')
    for sub in body:
        h1.append(sub)
    h1.text = body.text
print(ElementTree.tostring(etree))

Upvotes: -1

Related Questions