Trying_hard

Reputation: 9501

XML Python Parsing

I am new to parsing XML with Python. I have provided the XML below. I need to get two values from it: the afg attribute of Instrmt (afg="AG") and the QTYL attribute of the Qty element with Typ="FIN" (QTYL="149"). In other words, I need the AG and the 149.

I have tried the following:

from xml.dom import minidom

xmldoc = minidom.parse('test.xml')

batch = xmldoc.getElementsByTagName('Batch')[0]

rpt = batch.getElementsByTagName('PosRpt')

for ag in rpt:
    sym = ag.getElementsByTagName('Instrmt')
    print(sym)

When I do this I get a DOM object, and I am not sure how to get the results I am after.

<XML r="20030517" s="20042209" v="4.4" xr="FIA" xv="1">
  <Batch>
    <PosRpt RptID="175" BizDt="2013-01-03" ReqTyp="0">
      <Pty ID="Ade" R="21" />
      <Pty ID="000" R="4">
        <Sub ID="F" Typ="29" />
      </Pty>
      <Instrmt afg="AG" ID="AG" Src="8" CFI="FFI" MMY="2013" Matf="2013" />
      <Qty Typ="AOD" QTYL="134" QTYS="0" />
      <Qty Typ="FIN" QTYL="149" QTYS="0" />
      <Amt Typ="FMTM" Amt="155065.44" />
    </PosRpt>
  </Batch>
</XML>

Upvotes: 3

Views: 533

Answers (2)

Mark Tolonen

Reputation: 177396

Take a look at ElementTree and its XPath support:

from xml.etree import ElementTree as et

data = '''\
<XML r="20030517" s="20042209" v="4.4" xr="FIA" xv="1">
  <Batch>
    <PosRpt RptID="175" BizDt="2013-01-03" ReqTyp="0">
      <Pty ID="Ade" R="21" />
      <Pty ID="000" R="4">
        <Sub ID="F" Typ="29" />
      </Pty>
      <Instrmt afg="AG" ID="AG" Src="8" CFI="FFI" MMY="2013" Matf="2013" />
      <Qty Typ="AOD" QTYL="134" QTYS="0" />
      <Qty Typ="FIN" QTYL="149" QTYS="0" />
      <Amt Typ="FMTM" Amt="155065.44" />
    </PosRpt>
  </Batch>
</XML>
'''

#tree = et.parse('test.xml')
tree = et.fromstring(data)

# Find the first Instrmt node anywhere in the tree
print(tree.find('.//Instrmt').attrib['afg'])

# Find a Qty node with a particular attribute.
print(tree.find(".//Qty[@Typ='FIN']").attrib['QTYL'])

Output:

AG
149
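If the file contains more than one PosRpt, the same idea can be applied per report with findall. The following is only a sketch: the trimmed-down sample data, the second PosRpt record, and the helper name fin_positions are made up for illustration and are not from the question.

```python
from xml.etree import ElementTree as et

# Hypothetical sample: a trimmed version of the question's XML plus an
# invented second PosRpt, just to show iteration over several reports.
data = '''\
<XML>
  <Batch>
    <PosRpt RptID="175">
      <Instrmt afg="AG" ID="AG" />
      <Qty Typ="AOD" QTYL="134" QTYS="0" />
      <Qty Typ="FIN" QTYL="149" QTYS="0" />
    </PosRpt>
    <PosRpt RptID="176">
      <Instrmt afg="ZZ" ID="ZZ" />
      <Qty Typ="FIN" QTYL="7" QTYS="0" />
    </PosRpt>
  </Batch>
</XML>
'''

def fin_positions(xml_text):
    """Return a list of (afg, QTYL) pairs, one per PosRpt that has both."""
    root = et.fromstring(xml_text)
    results = []
    for rpt in root.findall('./Batch/PosRpt'):
        instrmt = rpt.find('Instrmt')
        qty = rpt.find("Qty[@Typ='FIN']")
        # find() returns None when nothing matches, so skip incomplete reports.
        if instrmt is None or qty is None:
            continue
        results.append((instrmt.get('afg'), qty.get('QTYL')))
    return results

print(fin_positions(data))
```

Note that attribute values come back as strings, so convert QTYL with int() if you need a number.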

Upvotes: 0

piokuc

Reputation: 26164

To extract the value of an attribute, call elt.getAttribute("attribute_name") on an element, for example:

print(elt.getAttribute("afg"), elt.getAttribute("ID"))

In your case sym is still a node list, not a single node (element), so you need to access the individual elements of the list, for example:

sym = ag.getElementsByTagName('Instrmt')
for e in sym:
    print(e.getAttribute("afg"))

Or just:

print(sym[0].getAttribute("afg"))

if you know there is only one element in the list.

You can check what your tag is with an expression like:

e.tagName == 'Instrmt'
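Putting these pieces together, getting both values with minidom might look like the sketch below. It uses minidom.parseString on a trimmed-down copy of the question's XML so it runs on its own; the helper name afg_and_fin_qty is made up for illustration, and with a real file you would use minidom.parse('test.xml') instead.

```python
from xml.dom import minidom

# Trimmed-down copy of the XML from the question (only the relevant tags).
data = '''<XML><Batch><PosRpt RptID="175">
  <Instrmt afg="AG" ID="AG" />
  <Qty Typ="AOD" QTYL="134" QTYS="0" />
  <Qty Typ="FIN" QTYL="149" QTYS="0" />
</PosRpt></Batch></XML>'''

def afg_and_fin_qty(xml_text):
    """Return (afg, QTYL) pairs for every Qty with Typ="FIN" in each PosRpt."""
    doc = minidom.parseString(xml_text)
    pairs = []
    for rpt in doc.getElementsByTagName('PosRpt'):
        instrmts = rpt.getElementsByTagName('Instrmt')
        if len(instrmts) == 0:
            continue  # no Instrmt in this report, nothing to pair with
        afg = instrmts[0].getAttribute('afg')
        for qty in rpt.getElementsByTagName('Qty'):
            # getAttribute returns an empty string for a missing attribute,
            # so this comparison is safe even if Typ is absent.
            if qty.getAttribute('Typ') == 'FIN':
                pairs.append((afg, qty.getAttribute('QTYL')))
    return pairs

print(afg_and_fin_qty(data))
```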

Upvotes: 1
