Reputation: 111
I have this XML:
<?xml version="1.0" encoding="utf-8" ?>
<ArrayOfEMObject2 xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
<EMObject2>
<emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid>
<orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname>
<streamclass>AUDIO</streamclass>
<streamtype>WAV</streamtype>
<prefusage>BROWSE</prefusage>
</EMObject2>
<EMObject2>
<emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid>
<orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
<streamclass>AUDIO</streamclass>
<streamtype>MP3</streamtype>
<prefusage>AUX</prefusage>
</EMObject2>
<EMObject2>
<emguid>f02ab3db-93c8-4cbf-82b8-5fb06704a4ea</emguid>
<orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
<streamclass>AUDIO</streamclass>
<streamtype>MP3</streamtype>
<prefusage>AUX</prefusage>
</EMObject2>
If the streamtype is MP3, I need the corresponding emguid and orgname.
I already have this:
from xml.etree import ElementTree
# ...
namespace = '{http://www.blue-order.com/ma/essencemanagerws/EssenceManager}'
for child in root.findall('.//{}streamtype'.format(namespace)):
    if child.text == 'MP3':
        ...  # unsure how to reach the sibling emguid and orgname from here
How should I proceed here?
Upvotes: 0
Views: 53
Reputation: 46
You can find and check the streamtype tag and then retrieve the other information like this:
from xml.etree import ElementTree
# ...
namespace = '{http://www.blue-order.com/ma/essencemanagerws/EssenceManager}'
for child in root.findall('.//{}EMObject2'.format(namespace)):
    # findtext returns None instead of raising if a child element is missing
    if child.findtext('{}streamtype'.format(namespace)) == 'MP3':
        print(child.findtext('{}emguid'.format(namespace)))
        print(child.findtext('{}orgname'.format(namespace)))
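For a self-contained version of the same idea, here is a minimal sketch that parses a trimmed copy of the XML from a string. It assumes the closing ArrayOfEMObject2 tag is present, and it leaves out the XML declaration and the unused xsd/xsi namespaces for brevity. The em prefix, the NS mapping, and the mp3_entries helper are names made up for illustration; passing a prefix map avoids formatting the namespace into every tag by hand.

```python
from xml.etree import ElementTree

# Trimmed copy of the question's data, with the root element closed.
XML = """<ArrayOfEMObject2 xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
  <EMObject2>
    <emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid>
    <orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname>
    <streamtype>WAV</streamtype>
  </EMObject2>
  <EMObject2>
    <emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid>
    <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
    <streamtype>MP3</streamtype>
  </EMObject2>
</ArrayOfEMObject2>"""

# Hypothetical prefix; any name works as long as it maps to the document's namespace URI.
NS = {"em": "http://www.blue-order.com/ma/essencemanagerws/EssenceManager"}


def mp3_entries(root):
    """Return (emguid, orgname) pairs for every EMObject2 whose streamtype is MP3."""
    results = []
    for obj in root.findall(".//em:EMObject2", NS):
        # findtext returns None when the child is absent, so no AttributeError here.
        if obj.findtext("em:streamtype", namespaces=NS) == "MP3":
            results.append((obj.findtext("em:emguid", namespaces=NS),
                            obj.findtext("em:orgname", namespaces=NS)))
    return results


root = ElementTree.fromstring(XML)
entries = mp3_entries(root)
for emguid, orgname in entries:
    print(emguid, orgname)
```

Returning a list of pairs rather than printing inside the loop makes the result easier to reuse, for example to build a dict keyed by emguid.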
Upvotes: 1
Reputation: 189297
Here's an attempt which instead seeks the EMObject2 instances and checks their children.
namespace = '{http://www.blue-order.com/ma/essencemanagerws/EssenceManager}'
tags = {'{}{}'.format(namespace, tag): tag
        for tag in ('orgname', 'streamtype', 'emguid')}
for node in root.findall('.//{}EMObject2'.format(namespace)):
    match = dict()
    for child in node:
        if child.tag in tags:
            match[tags[child.tag]] = child.text
    try:
        if match['streamtype'] == 'MP3':
            print(match['orgname'], match['emguid'])
    except KeyError:
        pass
(I had to repair your XML by adding a closing tag to get this to run.)
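If you would rather let the path expression do the filtering, ElementTree's limited XPath support includes a [tag='text'] predicate that selects elements having a child with exactly that text. A minimal sketch, assuming the repaired XML (trimmed here to the relevant children); the em prefix and the NS mapping are illustrative names, not part of the original data.

```python
from xml.etree import ElementTree

# Trimmed copy of the repaired XML (closing root tag added).
XML = """<ArrayOfEMObject2 xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
  <EMObject2>
    <emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid>
    <orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname>
    <streamtype>WAV</streamtype>
  </EMObject2>
  <EMObject2>
    <emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid>
    <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
    <streamtype>MP3</streamtype>
  </EMObject2>
  <EMObject2>
    <emguid>f02ab3db-93c8-4cbf-82b8-5fb06704a4ea</emguid>
    <orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
    <streamtype>MP3</streamtype>
  </EMObject2>
</ArrayOfEMObject2>"""

# Hypothetical prefix mapped to the document's default namespace.
NS = {"em": "http://www.blue-order.com/ma/essencemanagerws/EssenceManager"}

root = ElementTree.fromstring(XML)

# The predicate keeps only EMObject2 elements whose streamtype child text is exactly 'MP3'.
matches = root.findall(".//em:EMObject2[em:streamtype='MP3']", NS)

pairs = [(m.findtext("em:orgname", namespaces=NS),
          m.findtext("em:emguid", namespaces=NS))
         for m in matches]
for orgname, emguid in pairs:
    print(orgname, emguid)
```

The comparison is exact, so surrounding whitespace or a lowercase 'mp3' in the data would not match; in that case the explicit loop above is easier to adapt.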
Upvotes: 0
Reputation: 331
Try this.
from simplified_scrapy import SimplifiedDoc,utils
xml = '''
<?xml version="1.0" encoding="utf-8" ?>
<ArrayOfEMObject2 xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.blue-order.com/ma/essencemanagerws/EssenceManager">
<EMObject2>
<emguid>727ef486-31b3-48c3-b38e-39561995ef80</emguid>
<orgname>2435e6b6-e19a-4ca5-a708-47f7d9387bb9.wav</orgname>
<streamclass>AUDIO</streamclass>
<streamtype>WAV</streamtype>
<prefusage>BROWSE</prefusage>
</EMObject2>
<EMObject2>
<emguid>e866abef-7571-45a7-84be-85f2ffc35b31</emguid>
<orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
<streamclass>AUDIO</streamclass>
<streamtype>MP3</streamtype>
<prefusage>AUX</prefusage>
</EMObject2>
<EMObject2>
<emguid>f02ab3db-93c8-4cbf-82b8-5fb06704a4ea</emguid>
<orgname>201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3</orgname>
<streamclass>AUDIO</streamclass>
<streamtype>MP3</streamtype>
<prefusage>AUX</prefusage>
</EMObject2>
'''
doc = SimplifiedDoc(xml)
lst = doc.selects('streamtype').contains('MP3').parent
print ([(l.emguid.text,l.orgname.text) for l in lst])
# Or
lst = doc.selects('EMObject2')
for l in lst:
    if l.streamtype.text=='MP3':
        print (l.emguid.text,l.orgname.text)
Result:
[('e866abef-7571-45a7-84be-85f2ffc35b31', '201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3'), ('f02ab3db-93c8-4cbf-82b8-5fb06704a4ea', '201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3')]
e866abef-7571-45a7-84be-85f2ffc35b31 201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3
f02ab3db-93c8-4cbf-82b8-5fb06704a4ea 201701191006474010024190133005056B91BF30000003352B00000D0F094671.mp3
Upvotes: 1