Reputation: 1685
I have the following string, which is part of a bigger XML document:
content = '<odvNameElem stopID="9001002"><itdMapItemList/>Rathaus</odvNameElem>'
I want to access the text Rathaus. My current approach is to parse the string with lxml and read the text of the odvNameElem element:
from lxml import etree
content = '<odvNameElem stopID="9001002"><itdMapItemList/>Rathaus</odvNameElem>'
root = etree.fromstring(content)
print(root.text)
This however results in None. What am I doing wrong?
etree.__version__ = '4.2.5'
I am not sure why the following works:
root.xpath("string()")
but root.xpath("//text()") only returns an empty list. Can somebody please explain this?
Upvotes: 2
Views: 1205
Reputation: 50957
The "Rathaus" string is the value of the tail property of the itdMapItemList element. Examples:
root.xpath("itdMapItemList")[0].tail
root.find("itdMapItemList").tail
See https://lxml.de/tutorial.html#elements-contain-text.
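As a minimal sketch of the text/tail model described above, the snippet below parses the string from the question and compares the root's text with the child's tail. The variable names (child, root_text, child_tail, all_text) are only illustrative, and itertext() is shown as one possible way to collect all text regardless of where it sits.

```python
from lxml import etree

content = '<odvNameElem stopID="9001002"><itdMapItemList/>Rathaus</odvNameElem>'
root = etree.fromstring(content)
child = root.find("itdMapItemList")

# .text holds only the text between the start tag and the first child.
# There is nothing there in this document, so it is None.
root_text = root.text

# Text that follows a child's end tag is stored as that child's .tail.
child_tail = child.tail

# itertext() walks text and tail values in document order.
all_text = "".join(root.itertext())

print(repr(root_text), repr(child_tail), repr(all_text))
```

If the element can contain text both before and after child elements, joining itertext() avoids having to know which child carries the tail.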
root.xpath("string()") returns the string value of the root element, which is the concatenation of all of its descendant text nodes in document order. In this case that is indeed "Rathaus".
See https://www.w3.org/TR/xpath-10/#function-string.
root.xpath("//test") does not make sense (there is no test element). Did you mean root.xpath("//text()")?
root.xpath("//text()") returns a list of all text nodes, which in this case is ['Rathaus'].
If the input XML is changed to
<odvNameElem stopID="9001002">ABC<itdMapItemList/>Rathaus</odvNameElem>
then the result is ['ABC', 'Rathaus'].
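A short sketch to check this against the changed input, assuming lxml is installed. The names changed, text_nodes and string_value are illustrative; is_text and is_tail are attributes that lxml attaches to the string results of text() queries.

```python
from lxml import etree

changed = '<odvNameElem stopID="9001002">ABC<itdMapItemList/>Rathaus</odvNameElem>'
root = etree.fromstring(changed)

# Every text node in the document, in document order.
text_nodes = root.xpath("//text()")

# String value of the root element: all descendant text nodes joined.
string_value = root.xpath("string()")

# lxml's XPath string results record where each piece of text came from.
for node in text_nodes:
    print(repr(str(node)), "is_text:", node.is_text, "is_tail:", node.is_tail)

print(repr(string_value))
```

Here 'ABC' is reported as the text of odvNameElem and 'Rathaus' as the tail of itdMapItemList, which matches the text/tail explanation above.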
Upvotes: 2