Reputation: 3453
Let's assume I have following code:
<div id="first">
<div id="second">
<a></a>
<ul>...</ul>
</div>
</div>
Here's my code:
div_parents = root_element.xpath('//div[div]')
for div in reversed(div_parents):
if len(div.getchildren()) == 1:
# remove second div and replace it with it's content
I'm reaching div's with div children and then I want to remove the child div if that's the only child it's parent has. The result should be:
<div id="first">
<a></a>
<ul>...</ul>
</div>
I wanted to do it with:
div.replace(div.getchildren()[0], div.getchildren()[0].getchildren())
but unfortunatelly, both arguments on replace should consist of only one element. Is there something easier than reassigning all properties of first div to second div and then replacing both? - because I could do it easily with:
div.getparent().replace(div, div.getchildren()[0])
Upvotes: 0
Views: 426
Reputation: 6395
I'd just use list-replacement:
from lxml.etree import fromstring, tostring
xml = """<div id="first">
<div id="second">
<a></a>
<ul>...</ul>
</div>
</div>"""
doc = fromstring(xml)
outer_divs = doc.xpath("//div[div]")
for outer in outer_divs:
outer[:] = list(outer[0])
print tostring(doc)
Upvotes: 1
Reputation: 30210
Consider using copy.deepcopy
as suggested in the docs:
For example:
div_parents = root_element.xpath('//div[div]')
for outer_div in div_parents:
if len(outer_div.getchildren()) == 1:
inner_div = outer_div[0]
# Copy the children of innder_div to outer_div
for e in inner_div: outer_div.append( copy.deepcopy(e) )
# Remove inner_div from outer_div
outer_div.remove(inner_div)
Full code used to test:
import copy
import lxml.etree
def pprint(e): print(lxml.etree.tostring(e, pretty_print=True))
xml = '''
<body>
<div id="first">
<div id="second">
<a>...</a>
<ul>...</ul>
</div>
</div>
</body>
'''
root_element = lxml.etree.fromstring(xml)
div_parents = root_element.xpath('//div[div]')
for outer_div in div_parents:
if len(outer_div.getchildren()) == 1:
inner_div = outer_div[0]
# Copy the children of innder_div to outer_div
for e in inner_div: outer_div.append( copy.deepcopy(e) )
# Remove inner_div from outer_div
outer_div.remove(inner_div)
pprint(root_element)
Output:
<body>
<div id="first">
<a>...</a>
<ul>...</ul>
</div>
</body>
Note: The enclosing <body>
tag in the test code is unnecessary, I was just using it for testing multiple cases. The test code operates without issue on your input.
Upvotes: 1