Reputation: 81
I need to parse an XML file and find only the values that start with "123". How can I do this using the code below? Is it possible to use a regex inside this syntax?
import xml.etree.ElementTree as ET
parse = ET.parse('xml.xml')
print([ events.text for record in parse.findall('.configuration/system/') for events in record.findall('events')])
xml.xml
<rpc-reply>
<configuration>
<system>
<preference>
<events>123</events>
<events>124</events>
<events>1235</events>
</preference>
</system>
</configuration>
</rpc-reply>
Upvotes: 0
Views: 3099
Reputation: 89325
An XPath predicate can do this with the built-in function starts-with(). However, you need a library that fully supports XPath 1.0, such as lxml:
from lxml import etree as ET
raw = '''<rpc-reply>
<configuration>
<system>
<preference>
<events>123</events>
<events>124</events>
<events>1235</events>
</preference>
</system>
</configuration>
</rpc-reply>'''
root = ET.fromstring(raw)
query = 'configuration/system/preference/events[starts-with(.,"123")]'
print([events.text for events in root.xpath(query)])
If you still want to use a regex, lxml supports it even though the XPath 1.0 specification does not include regular expressions (see: Regex in lxml for python).
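As a minimal sketch of the regex route in lxml, assuming the EXSLT regular-expressions extension that lxml exposes under the http://exslt.org/regular-expressions namespace: the prefix re and the variable names below are illustrative choices, not part of the original answer.

```python
from lxml import etree

raw = '''<rpc-reply>
<configuration>
<system>
<preference>
<events>123</events>
<events>124</events>
<events>1235</events>
</preference>
</system>
</configuration>
</rpc-reply>'''

root = etree.fromstring(raw)

# Bind an (arbitrary) prefix to the EXSLT regular-expressions namespace.
ns = {'re': 'http://exslt.org/regular-expressions'}

# re:test() is true when the pattern matches the node's string value;
# "^123" anchors the match at the start of the text.
query = 'configuration/system/preference/events[re:test(., "^123")]'

matches = [events.text for events in root.xpath(query, namespaces=ns)]
print(matches)
```

For a plain prefix check, starts-with() is simpler; the regex form is only worth it when the pattern is more involved than a fixed prefix.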
xml.etree only supports a limited subset of XPath 1.0 expressions, which does not include the starts-with function (and definitely does not include regex). So you need to rely on a Python string method for the check:
....
query = 'configuration/system/preference/events'
# (events.text or '') guards against empty <events/> elements, whose text is None
print([events.text for events in root.findall(query) if (events.text or '').startswith('123')])
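If a regex is still preferred while staying with the standard library, one option is to select the elements with xml.etree and filter them in Python with the re module. This is a sketch under that assumption; the pattern and variable names are illustrative.

```python
import re
import xml.etree.ElementTree as ET

raw = '''<rpc-reply>
<configuration>
<system>
<preference>
<events>123</events>
<events>124</events>
<events>1235</events>
</preference>
</system>
</configuration>
</rpc-reply>'''

root = ET.fromstring(raw)

# Pattern.match() only matches at the beginning of the string,
# so no explicit "^" anchor is needed here.
pattern = re.compile(r'123')

query = 'configuration/system/preference/events'
matches = [
    events.text
    for events in root.findall(query)
    # Skip elements with no text (events.text is None for <events/>).
    if events.text and pattern.match(events.text)
]
print(matches)
```

The XPath query only locates the elements; all of the matching logic lives in the list comprehension, so any pattern supported by re can be swapped in.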
Upvotes: 1