Reputation: 25
I want to rename all <p> tags to <para> using lxml in Python.
Here's an example of what the XML file looks like.
<concept id="id15CDB0Q0Q4G"><title id="id15CDB0R0VHA">General</title>
<conbody><p>This section</p></conbody>
<concept id="id156F7H00GIE"><title id="id15CDB0R0V1W">
System</title>
<conbody><p> </p>
<p>The
</p>
<p>status.</p>
<p>sensors.</p>
I've been trying to do it like this, but .findall doesn't find the tags.
from lxml import etree
doc = etree.parse("73-20.xml")
print("\n")
print(etree.tostring(doc, pretty_print=True, xml_declaration=True, encoding="utf-8"))
print("\n")
raiz = doc.getroot()
print(raiz.tag)
children = raiz.getchildren()
print(children)
print("\n")
libros = doc.findall("p")
print(libros)
print("\n")
for i in range(len(libros)):
    if libros[i].find("p").tag == "p":
        libros[i].find("p").tag = "para"
Any thoughts?
Upvotes: 0
Views: 762
Reputation: 4462
lxml's findall() function searches by path. A bare "p" matches only direct children of the element it is called on, so use a path that descends into the tree:
libros = raiz.findall(".//p")
for el in libros:
    el.tag = "para"
Here .//p means that lxml searches for p elements at any depth, not just direct children.
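For anyone who wants to try this without the original file, here is a minimal self-contained sketch. The inline document is a simplified, closed-off stand-in for the question's XML (its ids are made up), and rename_tags is a hypothetical helper name, not part of lxml. The import falls back to the standard library's ElementTree, which supports the same findall() paths, in case lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:  # fallback: the stdlib module accepts the same calls used here
    import xml.etree.ElementTree as etree

# Simplified stand-in for the question's file; the ids are invented.
SAMPLE = """<concept id="c1"><title>General</title>
<conbody><p>This section</p></conbody>
<concept id="c2"><title>System</title>
<conbody><p>status.</p><p>sensors.</p></conbody>
</concept>
</concept>"""


def rename_tags(root, old, new):
    """Rename every descendant element called `old` to `new`; return how many changed."""
    matches = root.findall(".//" + old)  # .// searches at any depth below root
    for el in matches:
        el.tag = new
    return len(matches)


root = etree.fromstring(SAMPLE)

# A bare "p" only looks at direct children of <concept>, so nothing is found.
direct_children = root.findall("p")

renamed = rename_tags(root, "p", "para")
remaining = root.findall(".//p")
paras = [el.text for el in root.findall(".//para")]

print(direct_children, renamed, remaining, paras)
```

With the tree from the question, the same change can be saved with doc.write("out.xml", encoding="utf-8", xml_declaration=True), where out.xml is a placeholder file name.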
Upvotes: 1