Reputation: 49
I have this XML file:
<?xml version="1.0" ?><XMLSchemaPalletLoadTechData xmlns="http://tempuri.org/XMLSchemaPalletLoadTechData.xsd">
<TechDataParams>
<RunNumber>sample</RunNumber>
<Holder>sample</Holder>
<ProcessToolName>sample</ProcessToolName>
<RecipeName>sample</RecipeName>
<PalletName>sample</PalletName>
<PalletPosition>sample</PalletPosition>
<IsControl>sample</IsControl>
<LoadPosition>sample</LoadPosition>
<HolderJob>sample</HolderJob>
<IsSPC>sample</IsSPC>
<MeasurementType>sample</MeasurementType>
</TechDataParams>
<TechDataParams>
<RunNumber>sample</RunNumber>
<Holder>sample</Holder>
<ProcessToolName>sample</ProcessToolName>
<RecipeName>sample</RecipeName>
<PalletName>sample</PalletName>
<PalletPosition>sample</PalletPosition>
<IsControl>sample</IsControl>
<LoadPosition>sample</LoadPosition>
<HolderJob>sample</HolderJob>
<IsSPC>sample</IsSPC>
<MeasurementType>XRF</MeasurementType>
</TechDataParams>
</XMLSchemaPalletLoadTechData>
And this is my code for parsing the XML:
for data in xml.getElementsByTagName('TechDataParams'):
    # parse xml
    runnum = data.getElementsByTagName('RunNumber')[0].firstChild.nodeValue
    hold = data.getElementsByTagName('Holder')[0].firstChild.nodeValue
    processtn = data.getElementsByTagName('ProcessToolName')[0].firstChild.nodeValue
    recipedata = data.getElementsByTagName('RecipeName')[0].firstChild.nodeValue
    palletna = data.getElementsByTagName('PalletName')[0].firstChild.nodeValue
    palletposi = data.getElementsByTagName('PalletPosition')[0].firstChild.nodeValue
    control = data.getElementsByTagName('IsControl')[0].firstChild.nodeValue
    loadpos = data.getElementsByTagName('LoadPosition')[0].firstChild.nodeValue
    holderjob = data.getElementsByTagName('HolderJob')[0].firstChild.nodeValue
    spc = data.getElementsByTagName('IsSPC')[0].firstChild.nodeValue
    mestype = data.getElementsByTagName('MeasurementType')[0].firstChild.nodeValue
But when I print each node, I only get one set of 'TechDataParams'. I want to get every 'TechDataParams' set from the XML.
Let me know if my question is a bit unclear.
Upvotes: 1
Views: 644
Reputation: 10223
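This can also be done with the lxml.etree module. The document declares the default namespace http://tempuri.org/XMLSchemaPalletLoadTechData.xsd, so bind that namespace to a prefix and use the xpath method to find the target TechDataParams tags. Then loop over the children of each TechDataParams tag and create a dictionary whose key is the tag name and whose value is the text of the tag, collecting one dictionary per TechDataParams tag in a list.
Code: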
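from lxml import etree

# content holds the XML text from the question
root = etree.fromstring(content)
TechDataParams_info = []
# Bind the document's default namespace to the prefix "a" for the XPath query.
for i in root.xpath("//a:XMLSchemaPalletLoadTechData/a:TechDataParams", namespaces={"a": 'http://tempuri.org/XMLSchemaPalletLoadTechData.xsd'}):
    temp = dict()
    for j in i:
        # j.tag looks like "{namespace}RunNumber"; keep only the local name.
        temp[j.tag.split("}", 1)[-1]] = j.text
    TechDataParams_info.append(temp)
print(TechDataParams_info)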
output:
[{'PalletPosition': 'sample', 'HolderJob': 'sample', 'RunNumber': 'sample', 'ProcessToolName': 'sample', 'RecipeName': 'sample', 'IsControl': 'sample', 'PalletName': 'sample', 'LoadPosition': 'sample', 'MeasurementType': 'sample', 'Holder': 'sample', 'IsSPC': 'sample'}, {'PalletPosition': 'sample', 'HolderJob': 'sample', 'RunNumber': 'sample', 'ProcessToolName': 'sample', 'RecipeName': 'sample', 'IsControl': 'sample', 'PalletName': 'sample', 'LoadPosition': 'sample', 'MeasurementType': 'XRF', 'Holder': 'sample', 'IsSPC': 'sample'}]
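If installing lxml is not an option, the same namespace-aware lookup can be sketched with the standard library's xml.etree.ElementTree instead. This is a different module from the one used above, not a drop-in copy: it uses findall with a prefix map rather than xpath. The SAMPLE string is a shortened, made-up version of the question's file, and tech_data_params is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Prefix "a" is arbitrary; the URI must match the xmlns in the document.
NS = {"a": "http://tempuri.org/XMLSchemaPalletLoadTechData.xsd"}

# Shortened stand-in for the XML in the question (assumption, for illustration).
SAMPLE = """<?xml version="1.0" ?>
<XMLSchemaPalletLoadTechData xmlns="http://tempuri.org/XMLSchemaPalletLoadTechData.xsd">
<TechDataParams>
<RunNumber>001</RunNumber>
<MeasurementType>sample</MeasurementType>
</TechDataParams>
<TechDataParams>
<RunNumber>002</RunNumber>
<MeasurementType>XRF</MeasurementType>
</TechDataParams>
</XMLSchemaPalletLoadTechData>"""


def tech_data_params(xml_text):
    """Return one dict per TechDataParams element, keyed by local tag name."""
    root = ET.fromstring(xml_text)
    rows = []
    # root is XMLSchemaPalletLoadTechData, so its direct children are searched.
    for params in root.findall("a:TechDataParams", NS):
        # child.tag looks like "{namespace}RunNumber"; strip the namespace part.
        rows.append({child.tag.split("}", 1)[-1]: child.text for child in params})
    return rows


records = tech_data_params(SAMPLE)
for record in records:
    print(record)
```

Each dict holds whatever child tags that TechDataParams block contains, so the field names do not need to be listed in the code.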
Upvotes: 0
Reputation: 474171
Please don't dive into parsing XML with minidom, unless you want to pull your hair out.
I would use the xmltodict module here. One line and you have a list of dicts with all the data you need:
import xmltodict

data = """your xml here"""
# With two or more TechDataParams elements, this lookup returns a list of dict-like items.
data = xmltodict.parse(data)['XMLSchemaPalletLoadTechData']['TechDataParams']
for params in data:
    print(dict(params))
Prints:
{u'PalletPosition': u'sample', u'HolderJob': u'sample', u'RunNumber': u'sample', u'ProcessToolName': u'sample', u'RecipeName': u'sample', u'IsControl': u'sample', u'PalletName': u'sample', u'LoadPosition': u'sample', u'MeasurementType': u'sample', u'Holder': u'sample', u'IsSPC': u'sample'}
{u'PalletPosition': u'sample', u'HolderJob': u'sample', u'RunNumber': u'sample', u'ProcessToolName': u'sample', u'RecipeName': u'sample', u'IsControl': u'sample', u'PalletName': u'sample', u'LoadPosition': u'sample', u'MeasurementType': u'XRF', u'Holder': u'sample', u'IsSPC': u'sample'}
Upvotes: 1
Reputation: 4912
Here is an example for you. Replace file_path with your own path.
I replaced the values of RunNumber with 001 and 002.
#!/usr/bin/python
# -*- coding: utf-8 -*-
from xml.dom import minidom

# Replace with the path to your own XML file.
file_path = 'C:\\temp\\test.xml'
doc = minidom.parse(file_path)
TechDataParams = doc.getElementsByTagName('TechDataParams')
for t in TechDataParams:
    # Take the first RunNumber element inside this TechDataParams block.
    num = t.getElementsByTagName('RunNumber')[0]
    print('num is %s' % num.firstChild.data)
OUTPUT:
num is 001
num is 002
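To get every field of every TechDataParams block rather than only RunNumber, the same loop can be extended along these lines. This is only a sketch: SAMPLE is a shortened, made-up version of the question's XML, and FIELDS, text_of and read_all are hypothetical names. It also guards against empty elements, where firstChild is None and the original one-liner would raise an AttributeError.

```python
from xml.dom import minidom

# Shortened stand-in for the question's file (assumption, for illustration).
SAMPLE = """<?xml version="1.0" ?>
<XMLSchemaPalletLoadTechData xmlns="http://tempuri.org/XMLSchemaPalletLoadTechData.xsd">
<TechDataParams>
<RunNumber>001</RunNumber>
<Holder>H1</Holder>
<MeasurementType>sample</MeasurementType>
</TechDataParams>
<TechDataParams>
<RunNumber>002</RunNumber>
<Holder></Holder>
<MeasurementType>XRF</MeasurementType>
</TechDataParams>
</XMLSchemaPalletLoadTechData>"""

# Tags to read from each block; extend with the remaining tag names as needed.
FIELDS = ["RunNumber", "Holder", "MeasurementType"]


def text_of(parent, tag):
    """Return the text of the first <tag> under parent, or None if missing or empty."""
    nodes = parent.getElementsByTagName(tag)
    if not nodes or nodes[0].firstChild is None:
        return None
    return nodes[0].firstChild.data


def read_all(xml_text):
    """Return a list with one dict per TechDataParams element."""
    doc = minidom.parseString(xml_text)
    rows = []
    for params in doc.getElementsByTagName("TechDataParams"):
        # Build the dict inside the loop so every block is kept, not just the last.
        rows.append({tag: text_of(params, tag) for tag in FIELDS})
    return rows


all_rows = read_all(SAMPLE)
for row in all_rows:
    print(row)
```

Appending to the list inside the loop is the important part: variables assigned in the loop body are overwritten on each pass, so anything read after the loop only sees the last block.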
Upvotes: 0