user14308341

How to get text of XML tag using XPATH

I have a simple XML fragment and I want to get the text of the <tuv> element whose xml:lang attribute is "en". I want to select it by the attribute value, so I am using XPath.

<tu>
      <tuv xml:lang="de"><seg>Samp</seg></tuv>
      <tuv xml:lang="en"><seg>Python is cool!</seg></tuv>
</tu>

As you can see, the attribute on the <tuv> tag is xml:lang, so I tried this code:

print(body_xml.find("tu").find('./tuv[@xml:lang="en"]').find("seg").text)

However, it gives me this error:

 raise SyntaxError("prefix %r not found in prefix map" % prefix) from None
SyntaxError: prefix 'xml' not found in prefix map

I also tried removing the xml: prefix so that only lang is left, like this: print(body_xml.find("tu").find('./tuv[@lang="en"]').find("seg").text), and it worked. I think the problem is the : character. Is there a proper way to select on the full xml:lang attribute? I am open to any suggestions. Thank you!

Upvotes: 0

Views: 288

Answers (1)

balderman

Reputation: 23815

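You need to pass a namespace map when you call find. The xml prefix is reserved and always bound to http://www.w3.org/XML/1998/namespace, but ElementTree's find does not resolve a prefix unless the map you pass contains it: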

import xml.etree.ElementTree as ET

xml = '''<tu>
      <tuv xml:lang="de"><seg>Samp</seg></tuv>
      <tuv xml:lang="en"><seg>Python is cool!</seg></tuv>
</tu>'''

nsmap = {"xml": "http://www.w3.org/XML/1998/namespace"}
root = ET.fromstring(xml)
seg_txt = root.find('.//tuv[@xml:lang="en"]', nsmap).find('seg', nsmap).text
print(seg_txt)

Output:

Python is cool!
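
As a complement to the namespace map, here is a minimal sketch of the same lookup written with the expanded attribute name, assuming the standard library's xml.etree.ElementTree. It shows that the parser stores xml:lang under the key {http://www.w3.org/XML/1998/namespace}lang, which is why the prefix has to be resolved somehow. The names XML_NS, XML_LANG and seg_text are illustrative and not part of any library; the helper also returns None instead of raising when no matching <tuv> exists.

```python
import xml.etree.ElementTree as ET

# Namespace URI that the reserved "xml" prefix is always bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"
# Expanded ("Clark notation") name of xml:lang, as ElementTree stores it.
XML_LANG = f"{{{XML_NS}}}lang"

doc = '''<tu>
      <tuv xml:lang="de"><seg>Samp</seg></tuv>
      <tuv xml:lang="en"><seg>Python is cool!</seg></tuv>
</tu>'''

root = ET.fromstring(doc)

# Collect the attribute keys of each <tuv> to see how the parser names them.
attr_names = [list(tuv.attrib) for tuv in root.iter("tuv")]


def seg_text(root, lang):
    """Hypothetical helper: text of <seg> in the first <tuv> with this language, or None."""
    tuv = root.find(f'.//tuv[@{XML_LANG}="{lang}"]')
    return tuv.findtext("seg") if tuv is not None else None


print(attr_names)
print(seg_text(root, "en"))
print(seg_text(root, "fr"))
```

With the expanded name in the path, no namespace map needs to be passed to find; the nsmap version above is equivalent and usually easier to read when a document uses several prefixes.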

Upvotes: 1
