Reputation: 4036
I'm using lxml and I have an XML document like this:
<UploadFile>
  <Eu>
    <AUTO_ID>4</AUTO_ID>
    <Meter>000413031</Meter>
  </Eu>
</UploadFile>
How can I get only the tags that have text, such as AUTO_ID and Meter, but not UploadFile or Eu?
I have tried:
tree = lxml.etree.parse(xmlfile)
root = tree.getroot()
for node in root.iter('*'):
    if node.text != None:
        print(node.tag, node.text)
But this still prints all the tags. I only want the tags that have text. What can I do? Can anyone help? Best regards!
Upvotes: 0
Views: 55
Reputation: 89325
Unlike xml.etree, lxml supports more complex XPath expressions, including one that returns all descendant elements that have a child text node that isn't empty or whitespace-only:
for node in root.xpath(".//*[text()[normalize-space()]]"):
    print(node.tag, node.text)
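To try the XPath above without a file on disk, here is a minimal sketch that parses the XML from the question out of a string. The names XML, nodes and result are only illustrative, and the indentation inside the string is an assumption about how the real file is laid out.

```python
from lxml import etree

# The XML from the question, inlined so the example is self-contained.
XML = """<UploadFile>
  <Eu>
    <AUTO_ID>4</AUTO_ID>
    <Meter>000413031</Meter>
  </Eu>
</UploadFile>"""

root = etree.fromstring(XML)

# Select descendants that have at least one child text node which is
# non-empty after normalize-space() collapses and trims whitespace.
nodes = root.xpath(".//*[text()[normalize-space()]]")

result = [(n.tag, n.text) for n in nodes]
print(result)
```

Note that text() also matches text that follows a child element (what lxml exposes as that child's tail), so for mixed content such as <a><b/>text</a> the element a is selected even though a.text is None.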
Upvotes: 0
Reputation: 3158
In your for loop, strip the whitespace from node.text with strip(), then either check that the length of the result is greater than 0, or rely on the truthiness of node.text.strip() (an empty string is falsy):
option 1:
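import lxml.etree  # a bare "import lxml" does not load the etree submodule

tree = lxml.etree.parse("my_xml.xml")
root = tree.getroot()
for node in root.iter('*'):
    # node.text is None for elements with no text at all, so guard before strip()
    if node.text is not None and len(node.text.strip()) > 0:
        print(node.tag, node.text)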
option 2:
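import lxml.etree  # a bare "import lxml" does not load the etree submodule

tree = lxml.etree.parse("my_xml.xml")
root = tree.getroot()
for node in root.iter('*'):
    # skips None as well as whitespace-only text, which strips to '' (falsy)
    if node.text and node.text.strip():
        print(node.tag, node.text)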
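Both options can be folded into one small helper. The following is only a sketch: elements_with_text is a hypothetical name, the Empty element is added to the sample purely to exercise the case where node.text is None, and the import falls back to the standard library's xml.etree.ElementTree when lxml is not installed, since this loop uses only calls the two share.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib ElementTree is an acceptable stand-in here,
    # because fromstring(), iter('*'), .tag and .text exist in both.
    import xml.etree.ElementTree as etree


def elements_with_text(root):
    """Return (tag, stripped text) pairs for elements whose text is non-blank."""
    found = []
    for node in root.iter('*'):
        # node.text is None for elements like <Empty/>; treat that as ''.
        text = (node.text or "").strip()
        if text:
            found.append((node.tag, text))
    return found


# Sample input: the XML from the question plus a hypothetical empty element.
XML = (
    "<UploadFile>\n<Eu>\n"
    "<AUTO_ID>4</AUTO_ID>\n<Meter>000413031</Meter>\n<Empty/>\n"
    "</Eu>\n</UploadFile>"
)

pairs = elements_with_text(etree.fromstring(XML))
for tag, text in pairs:
    print(tag, text)
```

Unlike the XPath approach, this only looks at node.text, so text that appears after a child element (its tail) is not considered.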
Upvotes: 1