Reputation: 11
I am trying to iterate through the XML returned in a Requests response. Right now my Python code looks like this:
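import requests
from xml.etree import ElementTree
data = requests.post(url, data=xml, headers=headers).content  # raw response body (bytes)
tree = ElementTree.fromstring(data)  # returns the root element (soap:Envelope)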
And my XML looks like this:
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetPasswordResponse xmlns="https://tempuri.org/">
<GetPasswordResult>
<Content>ThisisContent</Content>
<UserName>ExampleName</UserName>
<Address>ExServer</Address>
<Database>tempdb</Database>
<PolicyID>ExPolicy</PolicyID>
<Properties>
<KeyAndValue>
<key>Content</key>
<value>ThisisContent</value>
</KeyAndValue>
<KeyAndValue>
<key>ReconcileIsWinAccount</key>
<value>Yes</value>
</KeyAndValue>
</Properties>
</GetPasswordResult>
</GetPasswordResponse>
</soap:Body></soap:Envelope>
How would I go about pulling out the values for the <Content>, <UserName>, and <PolicyID> tags using ElementTree? I have tried many different things but can't seem to get any of the values accessible.
Upvotes: 1
Views: 83
Reputation: 331
The simplified_scrapy library lets you select elements by tag name without having to deal with the XML namespaces.
from simplified_scrapy import SimplifiedDoc, req
xml = '''
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetPasswordResponse xmlns="https://tempuri.org/">
<GetPasswordResult>
<Content>ThisisContent</Content>
<UserName>ExampleName</UserName>
<Address>ExServer</Address>
<Database>tempdb</Database>
<PolicyID>ExPolicy</PolicyID>
<Properties>
<KeyAndValue>
<key>Content</key>
<value>ThisisContent</value>
</KeyAndValue>
<KeyAndValue>
<key>ReconcileIsWinAccount</key>
<value>Yes</value>
</KeyAndValue>
</Properties>
</GetPasswordResult>
</GetPasswordResponse>
</soap:Body></soap:Envelope>
'''
# xml = req.post(url, data=xml, headers=headers)
doc = SimplifiedDoc(xml)
nodes = doc.select('GetPasswordResult').selects('Content|UserName|PolicyID')
print([(node.tag, node.text) for node in nodes])
Result:
[('Content', 'ThisisContent'), ('UserName', 'ExampleName'), ('PolicyID', 'ExPolicy')]
Upvotes: 0
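That's a little tricky because GetPasswordResponse and everything inside it belong to a default namespace (xmlns="https://tempuri.org/") that has no prefix. ElementTree still needs the namespace in its queries, so map that URI to a prefix of your own choosing (x in the code below) and use the prefix in the path.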
from xml.etree import ElementTree as ET
data = '''\
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<soap:Body>
<GetPasswordResponse xmlns="https://tempuri.org/">
<GetPasswordResult>
<Content>ThisisContent</Content>
<UserName>ExampleName</UserName>
<Address>ExServer</Address>
<Database>tempdb</Database>
<PolicyID>ExPolicy</PolicyID>
<Properties>
<KeyAndValue>
<key>Content</key>
<value>ThisisContent</value>
</KeyAndValue>
<KeyAndValue>
<key>ReconcileIsWinAccount</key>
<value>Yes</value>
</KeyAndValue>
</Properties>
</GetPasswordResult>
</GetPasswordResponse>
</soap:Body></soap:Envelope>
'''
tree = ET.fromstring(data)
# Map prefixes to namespace URIs; 'x' is an arbitrary prefix for the default namespace
nmsp = {
    'soap': 'http://schemas.xmlsoap.org/soap/envelope/',
    'x': 'https://tempuri.org/',
}
print(tree.find('.//x:Content', namespaces=nmsp).text)
print(tree.find('.//x:UserName', namespaces=nmsp).text)
print(tree.find('.//x:PolicyID', namespaces=nmsp).text)
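As an alternative to a prefix map, here is a minimal sketch that writes the namespace URI directly into the tag name, in the form '{uri}tag'. It uses a shortened copy of the response from the question. The names NS, fields and properties are only illustrative. findtext returns None for a missing child, so an absent tag does not raise AttributeError the way .text on a failed find would.

```python
from xml.etree import ElementTree as ET

# Shortened copy of the SOAP response from the question.
data = '''<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<GetPasswordResponse xmlns="https://tempuri.org/">
<GetPasswordResult>
<Content>ThisisContent</Content>
<UserName>ExampleName</UserName>
<PolicyID>ExPolicy</PolicyID>
<Properties>
<KeyAndValue>
<key>Content</key>
<value>ThisisContent</value>
</KeyAndValue>
<KeyAndValue>
<key>ReconcileIsWinAccount</key>
<value>Yes</value>
</KeyAndValue>
</Properties>
</GetPasswordResult>
</GetPasswordResponse>
</soap:Body></soap:Envelope>
'''

# Default namespace of the payload, written as '{uri}' so it can be
# prepended to a tag name (illustrative constant, not part of any API).
NS = '{https://tempuri.org/}'

root = ET.fromstring(data)
result = root.find(f'.//{NS}GetPasswordResult')

# Direct children of GetPasswordResult; a missing tag yields None.
fields = {
    name: result.findtext(f'{NS}{name}')
    for name in ('Content', 'UserName', 'PolicyID', 'NoSuchTag')
}

# Iterate over every KeyAndValue element and collect key/value pairs.
properties = {
    kv.findtext(f'{NS}key'): kv.findtext(f'{NS}value')
    for kv in result.iter(f'{NS}KeyAndValue')
}

print(fields)
print(properties)
```

On Python 3.8 and later, a path such as './/{*}Content' matches the tag in any namespace, which avoids hard-coding the URI at the cost of being less strict about which element is matched.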
Upvotes: 1