M. W.

Reputation: 3

.text output from XML document with xml.etree.ElementTree

I'm trying to parse an XML document so that I get only the text inside the <rede> tag. When I test-print the result, it shows only square brackets, and print(redner.text) raises "AttributeError: 'list' object has no attribute 'text'". Why is the XML content stored as a list object, and how can I access the text inside the tag?

import os
from xml.etree import ElementTree
file_name = '19008-data.xml'
full_file = os.path.abspath(os.path.join('WP19_Protokolle_2018-2020',file_name))
dom = ElementTree.parse(full_file)
redner = dom.findall('rede')
print(redner)

output: [ ]

import os
from xml.etree import ElementTree
file_name = '19008-data.xml'
full_file = os.path.abspath(os.path.join('WP19_Protokolle_2018-2020',file_name))
dom = ElementTree.parse(full_file)
redner = dom.findall('rede')
print(redner.text)

AttributeError: 'list' object has no attribute 'text'

extract from XML-document

Upvotes: 0

Views: 1195

Answers (2)

marbu

Reputation: 2021

Why is the XML content stored as a list object, and how can I access the text inside the tag?

The XML content is not stored as a list. The documentation for the Element.findall() method says that it:

Returns a list containing all matching elements in document order.

In your case, the method found no matching elements, so it returned an empty list. A list has no .text attribute, and you can't read the text of an element that wasn't found.
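To illustrate the point about the return type, here is a minimal sketch. The XML string is a made-up miniature document (the real file's structure isn't shown in the question), and it assumes for simplicity that the <rede> elements sit directly under the root:

```python
from xml.etree import ElementTree

# Hypothetical miniature document, not the real 19008-data.xml.
# Here the <rede> elements are direct children of the root.
root = ElementTree.fromstring(
    "<protokoll><rede>First speech</rede><rede>Second speech</rede></protokoll>"
)

missing = root.findall('redner')  # no such tag here: empty list, not None
found = root.findall('rede')      # list of two Element objects

# .text belongs to each Element, not to the list, so loop over the result.
texts = [rede.text for rede in found]

print(missing)
print(texts)
```

Even when findall() does match something, the result is still a list, so the elements have to be accessed one by one (or by index) before reading .text.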

Upvotes: 0

Andrei Gurko

Reputation: 369

The official documentation at https://docs.python.org/3/library/xml.etree.elementtree.html gives the following explanation:

Element.findall() finds only elements with a tag which are direct children of the current element. Element.find() finds the first child with a particular tag, and Element.text accesses the element’s text content. Element.get() accesses the element’s attributes:

for country in root.findall('country'):
    rank = country.find('rank').text
    name = country.get('name')
    print(name, rank)
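Since findall('rede') only looks at direct children, a likely reason for the empty list is that the <rede> elements are nested deeper in the tree. The sketch below assumes such a nesting; the <protokoll> and <sitzungsverlauf> names are placeholders chosen for illustration, not taken from the real file. It shows two standard ways to search all descendants: the './/' path prefix and Element.iter().

```python
from xml.etree import ElementTree

# Hypothetical structure: <rede> is nested one level below the root's children.
xml_text = (
    "<protokoll><sitzungsverlauf>"
    "<rede>Hello</rede><rede>World</rede>"
    "</sitzungsverlauf></protokoll>"
)
root = ElementTree.fromstring(xml_text)

direct_children = root.findall('rede')  # only direct children of root
anywhere = root.findall('.//rede')      # './/' searches all descendants
via_iter = list(root.iter('rede'))      # iter() also walks the whole subtree

speech_texts = [rede.text for rede in anywhere]

print(direct_children)
print(speech_texts)
```

With a tree loaded via ElementTree.parse(), the same './/rede' path can be passed to dom.findall(), which starts the search at the root element.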

Upvotes: 1
