Reputation: 35
I should start by saying I'm totally new to writing code. I have been trying, unsuccessfully I might add, to grab information from an XML file. A small snippet from the XML file is as follows:
<?xml version="1.0"?>
<AlertRequestType xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<DateTime xmlns="http://EU/Common/20181/">2021-06-15T08:55:08.441</DateTime>
<Code xmlns=>A68</Code>
<UniqueAlertId xmlns="http://EU/20181/">US-8I2-NVH-7JH-0A1-
54M</UniqueAlertId>
<Message xmlns="http://EU/Common/20181/">B-Id mismatch.</Message>
<Source xmlns="http://EU/2556781/">National S I</Source>
<SupportingData xmlns="http://EMVS.EU/Common/20181/">
<Item key="errorcode" value="A68" />
<Item key="errormessage" value="B-Id mismatch." />
<Item key="date" value="2021-06-15" />
<Item key="time" value="21:35:03" />
<Item key="uniquealertid" value="US-8I2-NVH-7JH-0A1-54M" />
<Item key="productcode" value="988356696047773" />
<Item key="serialnumber" value="PFL72KBN85S22" />
<Item key="b-id" value="QD88223402+G+1332" />
</SupportingData>
</AlertRequestType>
Now, my question, as a person with a very poor understanding of ElementTree and coding in general: how can I grab a value from a specific "<Item key="? For example, from the item with the key errorcode, which has the value A68. The real kicker is that all the values will change, as I will use these values for work with different files every day (the item key attributes such as date or errorcode will never change, just their values), so I can't just search for one specific value; the code needs to grab the value from that exact xpath every time.
Below is the code I've tried to modify to fit my needs, but alas without success.
import xml.etree.ElementTree as ET
import os
xmlfile = 'xmltest.xml'
fullfile = os.path.abspath(os.path.join('filer', xmlfile))
tree = ET.parse(fullfile)
root = tree.getroot()
ET.dump(tree)
for elm in root.findall("./SupportingData/Item key/errorcode[@value=]"):
    print(elm.attrib)
Again, this code is from someone completely new to coding. If someone could help me with this I'd be eternally grateful!
Upvotes: 1
Views: 127
Reputation: 6826
The XML in your question couldn't possibly have been parsed - I'm assuming the same XML as @Bruno shows. Next time you post a question here make sure the data (and code) in your question works.
Minidom might be one way of solving your immediate problem, but in general I'd say ElementTree has better xml support, although if you need more complex xpath then lxml or other libraries are better.
Anyway, to solve your specific problem, the first reason your xpath doesn't work is because your xml uses namespaces.
Specifically for your xml this line specifies that tags below here are in the namespace http://EMVS.EU/Common/20181/
<SupportingData xmlns="http://EMVS.EU/Common/20181/">
The second reason your xpath couldn't possibly work is that Item key/errorcode[@value=] isn't correct xpath syntax - this should be Item[@key='errorcode'] - but the namespace problem means you haven't got to the point where this could fail to match or maybe cause an exception.
So your xpath needs to include the namespace in {} for tags or it won't match. This works:
for elm in root.findall("./{http://EMVS.EU/Common/20181/}SupportingData/{http://EMVS.EU/Common/20181/}Item[@key='errorcode']"):
    print(elm)
    print(elm.attrib)
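Since the question asks for the value itself rather than the whole attribute dictionary, here is a minimal sketch of one way to pull it out with the same namespaced xpath. The helper name item_value and the trimmed inline XML are my own assumptions for illustration; with the real file you would use ET.parse(fullfile).getroot() instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML, inlined so the sketch runs on its own.
SAMPLE = """<AlertRequestType>
  <SupportingData xmlns="http://EMVS.EU/Common/20181/">
    <Item key="errorcode" value="A68" />
    <Item key="date" value="2021-06-15" />
  </SupportingData>
</AlertRequestType>"""

# Namespace in the {uri} form that ElementTree expects in front of tag names.
NS = "{http://EMVS.EU/Common/20181/}"


def item_value(root, key):
    """Hypothetical helper: return the value attribute of the Item with this key, or None."""
    elm = root.find(f"./{NS}SupportingData/{NS}Item[@key='{key}']")
    return elm.get("value") if elm is not None else None


root = ET.fromstring(SAMPLE)
print(item_value(root, "errorcode"))  # A68
print(item_value(root, "date"))       # 2021-06-15
```

Because the lookup is by key and not by value, it keeps working when the values change from file to file.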
When namespaces are being used it can be difficult to get the xpath string to work. And when you start by trying to match a long series of tags/attributes, you don't know which bit isn't matching. My approach is pretty simple: start by getting the first xpath section to match, i.e.:
for elm in root.findall("./{http://EMVS.EU/Common/20181/}SupportingData"):
check it works - no point adding more until this first xpath works, then add the next match, check it works, add the next match, etc. This way when the xpath doesn't match it was the section you just added that's the problem.
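As a rough illustration of that step-by-step approach, the sketch below counts the matches for each successively longer xpath, so the step where the count drops to zero is the one to fix. The inline XML is a trimmed copy of the sample and the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML, inlined so the sketch runs on its own.
SAMPLE = """<AlertRequestType>
  <SupportingData xmlns="http://EMVS.EU/Common/20181/">
    <Item key="errorcode" value="A68" />
    <Item key="date" value="2021-06-15" />
  </SupportingData>
</AlertRequestType>"""

root = ET.fromstring(SAMPLE)
ns = "{http://EMVS.EU/Common/20181/}"

# Each entry adds one more piece to the previous xpath.
steps = [
    "./SupportingData",                                  # no namespace: finds nothing
    f"./{ns}SupportingData",                             # first section with namespace
    f"./{ns}SupportingData/{ns}Item",                    # add the child tag
    f"./{ns}SupportingData/{ns}Item[@key='errorcode']",  # add the attribute filter
]

match_counts = [len(root.findall(path)) for path in steps]
for path, count in zip(steps, match_counts):
    print(count, path)
```

With this trimmed sample the counts are 0, 1, 2 and 1, which shows directly that the un-namespaced path is the part that fails.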
There are other ways of matching namespaces, such as providing a namespace dictionary and using this like findall('role:character', ns)
- there are examples in the ElementTree documentation https://python.readthedocs.io/en/stable/library/xml.etree.elementtree.html
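Applied to the XML in this question, the namespace dictionary approach might look like the sketch below. The prefix emvs is an arbitrary name I picked; it only has to match between the dictionary and the xpath, and does not need to appear in the XML itself.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML, inlined so the sketch runs on its own.
SAMPLE = """<AlertRequestType>
  <SupportingData xmlns="http://EMVS.EU/Common/20181/">
    <Item key="errorcode" value="A68" />
    <Item key="serialnumber" value="PFL72KBN85S22" />
  </SupportingData>
</AlertRequestType>"""

root = ET.fromstring(SAMPLE)

# Map a prefix of our own choosing to the namespace URI used in the file.
ns = {"emvs": "http://EMVS.EU/Common/20181/"}

elm = root.find("./emvs:SupportingData/emvs:Item[@key='errorcode']", ns)
errorcode = elm.get("value") if elm is not None else None
print(errorcode)  # A68
```

This keeps the xpath strings shorter than repeating the full {uri} in front of every tag.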
Another approach which can be very convenient if you're not going to write the XML out AND there are no tags used in more than one namespace is to simply strip all namespaces from tags and also possibly from attributes. See examples of how to do this by @nonagon and myself here Python ElementTree module: How to ignore the namespace of XML files to locate matching element when using the method "find", "findall"
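For reference, a minimal sketch of the namespace-stripping idea is shown below. It is my own simplified version, not the exact code from the linked answers, and it only strips namespaces from tags, not from attributes. The function name strip_namespaces is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML, inlined so the sketch runs on its own.
SAMPLE = """<AlertRequestType>
  <SupportingData xmlns="http://EMVS.EU/Common/20181/">
    <Item key="errorcode" value="A68" />
    <Item key="b-id" value="QD88223402+G+1332" />
  </SupportingData>
</AlertRequestType>"""


def strip_namespaces(root):
    """Remove the leading {uri} from every tag in the tree, in place."""
    for elem in root.iter():
        if isinstance(elem.tag, str) and elem.tag.startswith("{"):
            elem.tag = elem.tag.split("}", 1)[1]
    return root


root = strip_namespaces(ET.fromstring(SAMPLE))

# After stripping, the plain xpath without any namespace matches.
elm = root.find("./SupportingData/Item[@key='errorcode']")
print(elm.get("value"))  # A68
```

As noted above, this is only safe if you are not writing the XML back out and no tag name is used in more than one namespace.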
Upvotes: 1
Reputation: 1016
Firstly we need to fix the XML provided as an example:
<?xml version="1.0"?>
<AlertRequestType xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<DateTime xmlns="http://EU/Common/20181/">2021-06-15T08:55:08.441</DateTime>
<Code xmlns="">A68</Code> <!-- Closing xmlns attribute -->
<UniqueAlertId xmlns="http://EU/20181/">US-8I2-NVH-7JH-0A1-
54M</UniqueAlertId>
<Message xmlns="http://EU/Common/20181/">B-Id mismatch.</Message>
<Source xmlns="http://EU/2556781/">National S I</Source>
<SupportingData xmlns="http://EMVS.EU/Common/20181/">
<Item key="errorcode" value="A68" />
<Item key="errormessage" value="B-Id mismatch." />
<Item key="date" value="2021-06-15" />
<Item key="time" value="21:35:03" />
<Item key="uniquealertid" value="US-8I2-NVH-7JH-0A1-54M" />
<Item key="productcode" value="988356696047773" />
<Item key="serialnumber" value="PFL72KBN85S22" />
<Item key="b-id" value="QD88223402+G+1332" />
</SupportingData>
</AlertRequestType>
Secondly, I'd suggest using a simpler library. The code below prints the key and value of every Item, which covers what you asked for:
import os
import xml.dom.minidom

if __name__ == "__main__":
    xmlfile = 'xmltest.xml'
    fullfile = os.path.abspath(os.path.join('filer', xmlfile))
    doc = xml.dom.minidom.parse(fullfile)
    # getElementsByTagName finds every Item element in the document
    items = doc.getElementsByTagName("Item")
    for i in items:
        print("Key:" + i.getAttribute("key"))
        print("Value:" + i.getAttribute("value"))
The output is:
Key:errorcode
Value:A68
Key:errormessage
Value:B-Id mismatch.
Key:date
Value:2021-06-15
Key:time
Value:21:35:03
Key:uniquealertid
Value:US-8I2-NVH-7JH-0A1-54M
Key:productcode
Value:988356696047773
Key:serialnumber
Value:PFL72KBN85S22
Key:b-id
Value:QD88223402+G+1332
I've given a solution that is specific to your question, but I'd advise you to create a class named XMLReader and put all the XML operations you need inside it.
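A very small sketch of what such a class could look like is below. The method names items and value are only suggestions, and the XML is passed in as a string so the example runs on its own; for a file you could use xml.dom.minidom.parse(fullfile) in the constructor instead.

```python
import xml.dom.minidom

# Trimmed copy of the sample XML, inlined so the sketch runs on its own.
SAMPLE = """<AlertRequestType>
  <SupportingData xmlns="http://EMVS.EU/Common/20181/">
    <Item key="errorcode" value="A68" />
    <Item key="time" value="21:35:03" />
  </SupportingData>
</AlertRequestType>"""


class XMLReader:
    """Illustrative wrapper around minidom for reading Item key/value pairs."""

    def __init__(self, xml_text):
        self.doc = xml.dom.minidom.parseString(xml_text)

    def items(self):
        """Return all Item elements as a {key: value} dictionary."""
        return {
            item.getAttribute("key"): item.getAttribute("value")
            for item in self.doc.getElementsByTagName("Item")
        }

    def value(self, key):
        """Return the value for one key, or None if the key is absent."""
        return self.items().get(key)


reader = XMLReader(SAMPLE)
print(reader.value("errorcode"))  # A68
print(reader.value("time"))       # 21:35:03
```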
Upvotes: 0