Alexander

Reputation: 29

Skip XML tag if attribute is missing

I am trying to extract data from the XML file below. Each type should be printed with the explanation of its family right next to it.

For example:

Orange They belong to the Citrus.They cannot grow at a temperature below
Lemon They belong to the Citrus.They cannot grow at a temperature below

<Fruits>
    <Fruit>
        <Family>Citrus</Family>
        <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
        <Type>Orange</Type>
        <Type>Lemon</Type>
        <Type>Lime</Type>
        <Type>Grapefruit</Type>
    </Fruit>
    <Fruit>
        <Family>Pomes</Family>
        <Type>Apple</Type>
        <Type>Pear</Type>        
    </Fruit>
</Fruits>

The code below works for the first Fruit element. The second one (Pomes) is a problem: it has no Explanation element, so f.find("Explanation") returns None and reading .text fails.

import os
from xml.etree import ElementTree
file_name = "example.xml"
full_file = os.path.abspath(os.path.join("xml", file_name))
dom = ElementTree.parse(full_file)
Fruit = dom.findall("Fruit")

for f in Fruit:
    Explanation = f.find("Explanation").text
    Types = f.findall("Type")
    for t in Types:
        Type = t.text
        print("{0}, {1}".format(Type, Explanation))

How can I skip Fruit elements such as the Pomes family when the Explanation child element is missing?

Upvotes: 1

Views: 1636

Answers (1)

Padraic Cunningham

Reputation: 180441

Using xml.etree, just try to find the Explanation child:

from xml.etree import ElementTree as et

# xml is the XML document as a string
root = et.fromstring(xml)

for node in root.iter("Fruit"):
    if node.find("Explanation") is not None:
        print(node.find("Family").text)
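
Applied to the loop in the question, the same check could look like the sketch below. It is only an illustration: the helper name type_explanation_pairs and the inline XML constant are made up for this example, and the sample is parsed from a string instead of from example.xml.

```python
from xml.etree import ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<Fruits>
    <Fruit>
        <Family>Citrus</Family>
        <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
        <Type>Orange</Type>
        <Type>Lemon</Type>
        <Type>Lime</Type>
        <Type>Grapefruit</Type>
    </Fruit>
    <Fruit>
        <Family>Pomes</Family>
        <Type>Apple</Type>
        <Type>Pear</Type>
    </Fruit>
</Fruits>"""


def type_explanation_pairs(root):
    """Hypothetical helper: collect (type, explanation) tuples,
    skipping every Fruit that has no Explanation child."""
    pairs = []
    for fruit in root.iter("Fruit"):
        explanation = fruit.find("Explanation")
        if explanation is None:
            # No Explanation child (the Pomes case): skip this Fruit.
            continue
        for type_node in fruit.findall("Type"):
            pairs.append((type_node.text, explanation.text))
    return pairs


root = ET.fromstring(XML)
for type_name, explanation in type_explanation_pairs(root):
    print("{0}, {1}".format(type_name, explanation))
```

The comparison uses `is None` on purpose. An element without children is falsy, so a plain `if explanation:` would also skip an Explanation that is present but contains only text.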

You could also use lxml with an XPath predicate that selects only the Fruit nodes that have an Explanation child:

import lxml.etree as et

root = et.fromstring(xml)

for node in root.xpath("//Fruit[Explanation]"):
    print(node.xpath("Family/text()"))

If we run it on your sample you will see we just get Citrus:

In [1]: xml = """<Fruits>
   ...:     <Fruit>
   ...:         <Family>Citrus</Family>
   ...:         <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
   ...:         <Type>Orange</Type>
   ...:         <Type>Lemon</Type>
   ...:         <Type>Lime</Type>
   ...:         <Type>Grapefruit</Type>
   ...:     </Fruit>
   ...:         <Fruit>
   ...:         <Family>Pomes</Family>
   ...:         <Type>Apple</Type>
   ...:         <Type>Pear</Type>
   ...:     </Fruit>
   ...: </Fruits>"""


In [2]: import lxml.etree as et

In [3]: root = et.fromstring(xml)

In [4]: for node in root.xpath("//Fruit[Explanation]"):
   ...:         print(node.xpath("Family/text()"))
   ...:     
['Citrus']
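
If installing lxml is not an option, the limited XPath subset in the standard library's xml.etree also accepts the [tag] predicate, which matches elements that have a direct child with that name. A minimal sketch, assuming the document is available as a string (the XML constant here is a shortened copy of the sample):

```python
from xml.etree import ElementTree as ET

# Shortened copy of the sample document, for illustration only.
XML = """<Fruits>
    <Fruit>
        <Family>Citrus</Family>
        <Explanation>They belong to the Citrus.They cannot grow at a temperature below</Explanation>
        <Type>Orange</Type>
    </Fruit>
    <Fruit>
        <Family>Pomes</Family>
        <Type>Apple</Type>
    </Fruit>
</Fruits>"""

root = ET.fromstring(XML)

# "Fruit[Explanation]" keeps only the Fruit children of the root
# that have an Explanation child of their own.
families = [fruit.findtext("Family") for fruit in root.findall("Fruit[Explanation]")]
print(families)
```

Unlike the lxml version, findall is relative to the element it is called on, so the path starts at Fruit instead of //Fruit.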

Upvotes: 2
