Get inner xml from lxml

Question

I have the following string, which is part of a bigger XML document:

content = '<odvNameElem><itdMapItemList/>Rathaus</odvNameElem>'

I want to access Rathaus. My current approach is to parse the string with lxml and read the text of the 'odvNameElem' element:

from lxml import etree

content = '<odvNameElem><itdMapItemList/>Rathaus</odvNameElem>'
root = etree.fromstring(content)  # root is the odvNameElem element
print(root.text)                  # prints None

This however results in None. What am I doing wrong?

etree.__version__ = '4.2.5'

I am not sure why the following works: root.xpath("string()") but root.xpath("//text()") only returns an empty list. Can somebody please explain this?

mzjn · Accepted Answer

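root.text is None because the text property only holds the text that comes before an element's first child. The "Rathaus" string comes after the itdMapItemList child, so it is stored as the tail property of the itdMapItemList element. Examples: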

root.xpath("itdMapItemList")[0].tail
root.find("itdMapItemList").tail

See https://lxml.de/tutorial.html#elements-contain-text.
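
A minimal sketch of the text/tail distinction follows. The markup is an assumption: it is the smallest document consistent with the element names and results mentioned in this thread, and the real input may carry attributes or more children.

```python
from lxml import etree

# Assumed minimal input: an empty itdMapItemList child followed by text.
content = "<odvNameElem><itdMapItemList/>Rathaus</odvNameElem>"
root = etree.fromstring(content)

# .text covers only the text between the start tag and the first child.
print(root.text)

# Text that follows a child's end tag is stored on that child as .tail.
child = root.find("itdMapItemList")
print(child.tail)
```

With this input, root.text is None and child.tail is "Rathaus".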

root.xpath("string()") returns the concatenation of the string values of the root node and its descendants, which indeed is "Rathaus" in this case.

See https://www.w3.org/TR/xpath-10/#function-string.
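
As a rough illustration, string() can be compared with joining the pieces from itertext(), which walks the same text content in document order. The input markup is again an assumed minimal reconstruction, not necessarily the exact original.

```python
from lxml import etree

# Assumed minimal input (attributes, if any, are unknown).
content = "<odvNameElem><itdMapItemList/>Rathaus</odvNameElem>"
root = etree.fromstring(content)

# XPath string(): string value of the context node, i.e. all
# descendant text concatenated in document order.
via_xpath = root.xpath("string()")

# itertext() yields the same text pieces (text and tails) one by one.
via_itertext = "".join(root.itertext())

print(via_xpath, via_itertext)
```

Both expressions give "Rathaus" here, because the only text in the tree is the tail of itdMapItemList.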

root.xpath("//test") does not make sense (there is no test element). Did you mean root.xpath("//text()")?

root.xpath("//text()") returns a list of all text nodes, which in this case is ['Rathaus'].

If the input XML is changed to

<odvNameElem>ABC<itdMapItemList/>Rathaus</odvNameElem>

then the result is ['ABC', 'Rathaus']
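
The following sketch reproduces that result. It assumes the changed input places "ABC" directly before the itdMapItemList child. The strings lxml returns from text() are "smart" strings, so the is_tail flag can show which property each one came from.

```python
from lxml import etree

# Assumed layout of the changed input: "ABC" before the child, "Rathaus" after it.
content = "<odvNameElem>ABC<itdMapItemList/>Rathaus</odvNameElem>"
root = etree.fromstring(content)

# //text() selects every text node in the document, in document order.
texts = root.xpath("//text()")
print(list(texts))

# is_tail is False for .text content and True for .tail content.
origins = [t.is_tail for t in texts]
print(origins)
```

Here "ABC" is the text of odvNameElem and "Rathaus" is still the tail of itdMapItemList.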
