Reputation: 1120
I'm using lxml to parse some HTML fragments (from an RSS feed), and in order to do this efficiently I pass create_parent='div' to fragment_fromstring. When I later output the HTML I don't want the parent div to be included, since with my HTML layout it ends up being a div in a div, which is totally unnecessary.
The code as it is now:
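    import lxml.html  # needed for lxml.html.tostring below
    from lxml.html import fragment_fromstring

    # (inside the function that receives html_string)
    html = fragment_fromstring(html_string, create_parent='div')
    # strip class and id attributes from every element that has them
    for tag in html.xpath('//*[@class]'):
        tag.attrib.pop('class')
    for tag in html.xpath('//*[@id]'):
        tag.attrib.pop('id')
    return lxml.html.tostring(html)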
TL;DR: how do I remove the wrapping div when it outputs?
Upvotes: 4
Views: 1291
Reputation: 369074
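Serialize the child elements instead of the wrapper. On Python 3, tostring returns bytes by default, so pass encoding='unicode' to be able to join the results as strings:
    return '\n'.join(lxml.html.tostring(x, encoding='unicode') for x in html.iterchildren())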
Upvotes: 2