Reputation: 331
I am having trouble extracting a (name, value) pair from XML where name == 'mykey', using Python's xml.etree.ElementTree library. The pseudo-code for what I want is:
if name.text == 'mykey':
    print(value.text)
Here value.text would be the value paired with mykey, which in this example is XX111.
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE charles-session SYSTEM "https://www.charlesproxy.com/dtd/charles-session-1_2.dtd">
<charles-session>
<header>
<name>Content-Length</name>
<value>10804</value>
</header>
<header>
<name>Date</name>
<value>Wed, 13 Oct 2021 22:02:42 GMT</value>
</header>
<header>
<name>mykey</name>
<value>XX111</value>
</header>
<header>
<name>Accept-Language</name>
<value>en-US;q=1.0, el-US;q=0.9</value>
</header>
</charles-session>
I can get the 'mykey' text printed, but I don't know how to say "now give me the text of the value right after 'mykey'".
for name in root_node.iter('name'):
    if re.match(r'mykey', name.text):
        print(name.text)
Upvotes: 0
Views: 950
Reputation: 3158
This gets the element's value and the line number the element is on.
See the comments in the code for an explanation:
# Data setup
from io import StringIO
xml_data="""\
<?xml version="1.0" encoding="UTF-8"?>
<charles-session>
<header>
<name>Content-Length</name>
<value>10804</value>
</header>
<header>
<name>Date</name>
<value>Wed, 13 Oct 2021 22:02:42 GMT</value>
</header>
<header>
<name>mykey</name>
<value>XX111</value>
</header>
<header>
<name>Accept-Language</name>
<value>en-US;q=1.0, el-US;q=0.9</value>
</header>
</charles-session>
"""
Now find the element you are looking for and print its value.
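import xml.etree.ElementTree as ET

# parse the string and get the root element
root = ET.parse(StringIO(xml_data)).getroot()
# visit every <header> and print the <value> of the one whose <name> is "mykey"
for header in root.findall(".//header"):
    if header.findtext("name") == "mykey":
        print(header.findtext("value"))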
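As a minimal sketch of the same lookup without relying on child positions, the pair can be matched by tag name instead. The helper name header_value and the shortened XML_DATA sample are assumptions made for illustration; only standard xml.etree.ElementTree calls are used.

```python
import xml.etree.ElementTree as ET

# Shortened sample (hypothetical): same shape as the XML in the question.
XML_DATA = """\
<charles-session>
<header>
<name>Content-Length</name>
<value>10804</value>
</header>
<header>
<name>mykey</name>
<value>XX111</value>
</header>
</charles-session>
"""


def header_value(root, wanted_name):
    """Return the <value> text of the first <header> whose <name> equals wanted_name."""
    for header in root.iter("header"):
        if header.findtext("name") == wanted_name:
            return header.findtext("value")
    return None  # no matching header


root = ET.fromstring(XML_DATA)
print(header_value(root, "mykey"))  # XX111

# The same lookup as a single ElementPath expression with a child-text predicate.
print(root.findtext(".//header[name='mykey']/value"))  # XX111
```

Comparing with == avoids accidental hits on names that merely contain "mykey", and looking up the children by tag keeps working if the order of name and value inside a header ever changes.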
To get the line number as well, here is a tweaked version of the line-numbering parser (credit: @Duncan Harris):
import sys
# Force the pure-Python ElementTree so that the _start/_end overrides below are
# actually called. This must run before xml.etree.ElementTree is first imported.
sys.modules['_elementtree'] = None
import xml.etree.ElementTree as ET

class LineNumberingParser(ET.XMLParser):
    def _start(self, *args, **kwargs):
        # Here we assume the default XML parser which is expat
        # and copy its element position attributes into output Elements
        element = super()._start(*args, **kwargs)
        element._start_line_number = self.parser.CurrentLineNumber
        element._start_column_number = self.parser.CurrentColumnNumber
        element._start_byte_index = self.parser.CurrentByteIndex
        return element

    def _end(self, *args, **kwargs):
        element = super()._end(*args, **kwargs)
        element._end_line_number = self.parser.CurrentLineNumber
        element._end_column_number = self.parser.CurrentColumnNumber
        element._end_byte_index = self.parser.CurrentByteIndex
        return element

tree = ET.parse('test.xml', parser=LineNumberingParser())
Now report the line number. Expat's CurrentLineNumber is counted from the top of the file, so the number stored on the <value> element can be printed directly:
for header in tree.findall(".//header"):
    if header.findtext("name") == "mykey":
        value = header.find("value")
        print(f"{value.text} at line number -> {value._start_line_number}")
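If overriding private ElementTree hooks feels fragile, a rough alternative is to drive expat directly through the public xml.parsers.expat module and record the line of each <value> start tag. This is a different technique from the parser subclass above, not a drop-in replacement; the function name find_value_with_line, the state dictionary, and the shortened XML_DATA sample are all assumptions made for illustration.

```python
import xml.parsers.expat

# Shortened sample (hypothetical): same shape as the XML in the question.
XML_DATA = """\
<charles-session>
<header>
<name>Content-Length</name>
<value>10804</value>
</header>
<header>
<name>mykey</name>
<value>XX111</value>
</header>
</charles-session>
"""


def find_value_with_line(xml_text, wanted_name):
    """Return (value_text, line_number) for the first matching header, or None."""
    parser = xml.parsers.expat.ParserCreate()
    state = {"tag": None, "name": "", "value": "", "line": None, "result": None}

    def start(tag, attrs):
        state["tag"] = tag
        if tag == "header":
            # A new header begins: reset the collected pair.
            state["name"], state["value"], state["line"] = "", "", None
        elif tag == "value":
            # 1-based line of the <value> start tag, counted from the top of the input.
            state["line"] = parser.CurrentLineNumber

    def chars(data):
        # Text may arrive in several chunks, so accumulate it.
        if state["tag"] in ("name", "value"):
            state[state["tag"]] += data

    def end(tag):
        state["tag"] = None
        if tag == "header" and state["result"] is None and state["name"] == wanted_name:
            state["result"] = (state["value"], state["line"])

    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return state["result"]


print(find_value_with_line(XML_DATA, "mykey"))  # ('XX111', 8)
```

The trade-off is that no element tree is built, so this only suits cases where the value and its position are all that is needed.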
Upvotes: 1