Alberto Valdivia
Alberto Valdivia

Reputation: 25

How to change tags with lxml in Python?

I want to change all tags names <p> to <para> using lxml in python.

Here's an example of what the xml file looks like.

<concept id="id15CDB0Q0Q4G"><title id="id15CDB0R0VHA">General</title>
<conbody><p>This section</p></conbody>
<concept id="id156F7H00GIE"><title id="id15CDB0R0V1W">
System</title>
<conbody><p> </p>
<p>The 
</p>
<p>status.</p>
<p>sensors.</p>

And I've been trying to code it like this but it doesn't find the tags with .findall.

from lxml import etree
doc = etree.parse("73-20.xml")
print("\n")

print(etree.tostring(doc, pretty_print=True, xml_declaration=True, encoding="utf-8"))
print("\n")

raiz = doc.getroot()
print(raiz.tag)
children = raiz.getchildren()
print(children)
print("\n")

libros = doc.findall("p")
print(libros)
print("\n")

for i in range(len(libros)):
    if libros[i].find("p").tag == "p" :
        libros[i].find("p").tag = "para"

Any thoughts?

Upvotes: 0

Views: 762

Answers (1)

Alexandra Dudkina
Alexandra Dudkina

Reputation: 4462

lxml findall() function provides ability to search by path:

libros = raiz.findall(".//p")

for el in libros:
    el.tag = "para"

Here .//p means that lxml will search nested p elements as well.

Upvotes: 1

Related Questions