user1885868

Reputation: 1093

Parse XML using Python and xml.etree

I have an XML file and I'd like to read some data out of it using Python's xml.etree.

Let's say the XML file looks like this:

<a>
   <b>
      <c>
         <d>This is the text I want to retrieve</d>
      </c>
   </b>
</a>

What I did was something like this:

from xml.etree import ElementTree

document = ElementTree.parse('file.xml')
dToFind = document.find('d')
print(dToFind.text)

But it gave me the following error:

    print(dToFind.text)
AttributeError: 'NoneType' object has no attribute 'text'

What did I do wrong? And how can I fix it?

Thanks!

Upvotes: 1

Views: 184

Answers (1)

karthikr

Reputation: 99680

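find('d') only looks at the direct children of the element you call it on, and <d> is not a direct child of the root <a>, so it returns None. You can pass find an XPath expression to search for the node recursively.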

In this case:

dToFind = document.find('.//d')

The xml.etree.ElementTree documentation describes the subset of XPath it supports, which allows more structured queries. I would encourage you to look into that.
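As a rough sketch of how a few of the supported path forms behave on the XML from the question (the variable names here are only illustrative, and the XML is inlined as a string rather than read from file.xml):

```python
from xml.etree import ElementTree as ET

content = """<a>
   <b>
      <c>
         <d>This is the text I want to retrieve</d>
      </c>
   </b>
</a>
"""

root = ET.fromstring(content)

# A bare tag name only matches direct children of the element searched from.
direct = root.find('d')        # no match: <d> is not a direct child of <a>

# './/d' searches all descendants of the root, at any depth.
anywhere = root.find('.//d')

# An explicit relative path also works when the nesting is known in advance.
by_path = root.find('b/c/d')

# iter() walks the whole subtree and yields every element with that tag.
all_d = list(root.iter('d'))
```

Which form to use mostly depends on how much of the structure you want to rely on: './/d' tolerates changes in nesting, while 'b/c/d' only matches that exact path.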

DEMO

>>> from xml.etree import ElementTree as ET
>>> content = """<a>
...    <b>
...       <c>
...          <d>This is the text I want to retrieve</d>
...       </c>
...    </b>
... </a>
... """
>>> 
>>> xx = ET.fromstring(content) # with a file you would use ET.parse('file.xml') instead
>>> xx.find('d') # Returns None, since a bare tag name only matches direct children
>>> xx.find('.//d')
<Element 'd' at 0xf1b550>
>>> d = xx.find('.//d')
>>> d.text
'This is the text I want to retrieve'
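Since find returns None when nothing matches, it may be worth guarding the lookup before touching .text. A minimal sketch, assuming the same XML as above; the tag 'e' and the '(not found)' placeholder are made up for illustration:

```python
from xml.etree import ElementTree as ET

root = ET.fromstring(
    "<a><b><c><d>This is the text I want to retrieve</d></c></b></a>"
)

# Compare against None explicitly: an Element with no children is falsy,
# so "if node:" would wrongly skip a <d> that was actually found.
node = root.find('.//d')
text = node.text if node is not None else None

# findtext() does the lookup and the .text access in one step and
# returns the default when no element matches the path.
found = root.findtext('.//d', default='(not found)')
missing = root.findtext('.//e', default='(not found)')
```

With this pattern a wrong path produces a visible placeholder (or None) instead of the AttributeError from the question.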

Upvotes: 2
