cceaton01

Reputation: 23

Getting XML attributes from XML with namespaces and Python (lxml)

I'm trying to grab the "id" and "href" attributes from the below XML. Thus far I can't seem to get my head around the namespacing aspects. I can get things easily enough with XML that doesn't have namespace references. But this has befuddled me. Any ideas would be appreciated!

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<ns3:searchResult total="1" xmlns:ns5="ers.ise.cisco.com" xmlns:ers-v2="ers-v2" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns3="v2.ers.ise.cisco.com">
    <ns3:resources>
        <ns5:resource id="d28b5080-587a-11e8-b043-d8b1906198a4" name="00:1B:4F:32:27:50">
            <link rel="self" href="https://ho-lab-ise1:9060/ers/config/endpoint/d28b5080-587a-11e8-b043-d8b1906198a4" type="application/xml"/>
        </ns5:resource>
    </ns3:resources>
</ns3:searchResult>

Upvotes: 2

Views: 718

Answers (2)

Daniel Haley

Reputation: 52888

@laurent-laporte's answer is great for showing how to handle multiple namespaces (+1).

However, if you truly only need to select a couple of attributes no matter what namespace their elements are in, you can test local-name() in a predicate...

from lxml import etree

tree = etree.parse('your.xml')

attrs = tree.xpath("//@*[local-name()='id' or local-name()='href']")

for attr in attrs:
    print(attr)

This will print (the same as Laurent's)...

d28b5080-587a-11e8-b043-d8b1906198a4
https://ho-lab-ise1:9060/ers/config/endpoint/d28b5080-587a-11e8-b043-d8b1906198a4
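For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch. It assumes the XML from the question with the attribute spacing repaired and the root element closed so that it parses; the XML constant and the pairs variable are names made up for this example. The attribute results lxml returns are "smart" strings, so attrname tells you which attribute each value came from.

```python
from lxml import etree

# Sample document adapted from the question (assumption: root element closed
# and attribute spacing fixed so the parser accepts it).
XML = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<ns3:searchResult total="1" xmlns:ns5="ers.ise.cisco.com" xmlns:ers-v2="ers-v2" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns3="v2.ers.ise.cisco.com">
  <ns3:resources>
    <ns5:resource id="d28b5080-587a-11e8-b043-d8b1906198a4" name="00:1B:4F:32:27:50">
      <link rel="self" href="https://ho-lab-ise1:9060/ers/config/endpoint/d28b5080-587a-11e8-b043-d8b1906198a4" type="application/xml"/>
    </ns5:resource>
  </ns3:resources>
</ns3:searchResult>"""

root = etree.fromstring(XML)

# Select every attribute named "id" or "href", whatever element it sits on.
attrs = root.xpath("//@*[local-name()='id' or local-name()='href']")

# Pair each value with the name of the attribute it was read from.
pairs = [(attr.attrname, str(attr)) for attr in attrs]

for name, value in pairs:
    print(name, value)
```

Keep in mind that this matches any id or href attribute anywhere in the document, so it only fits when those names are not reused on elements you want to skip.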

Upvotes: 0

Laurent LAPORTE

Reputation: 23002

You can use the xpath function to search for all resources and iterate over them. The function has a namespaces keyword argument, which you can use to declare the mapping between namespace prefixes and namespace URIs.

Here is the idea:

from lxml import etree

NS = {
    "ns5": "ers.ise.cisco.com",
    "ns3": "v2.ers.ise.cisco.com"
}

tree = etree.parse('your.xml')

resources = tree.xpath('//ns5:resource', namespaces=NS)

for resource in resources:
    print(resource.attrib['id'])
    links = resource.xpath('link')
    for link in links:
        print(link.attrib['href'])

Sorry, this is not tested.

Here is the documentation about xpath.
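One point that often causes confusion: the prefixes in the mapping do not have to match the ones used in the document, only the URIs do. Below is a small sketch of that, assuming the XML from the question with the root element closed so it parses. The XML constant and the "r" prefix are made up for this example. It also selects the attributes directly in the XPath expression instead of looping over elements.

```python
from lxml import etree

# Sample document adapted from the question (assumption: root element closed
# and attribute spacing fixed so the parser accepts it).
XML = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<ns3:searchResult total="1" xmlns:ns5="ers.ise.cisco.com" xmlns:ers-v2="ers-v2" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:ns3="v2.ers.ise.cisco.com">
  <ns3:resources>
    <ns5:resource id="d28b5080-587a-11e8-b043-d8b1906198a4" name="00:1B:4F:32:27:50">
      <link rel="self" href="https://ho-lab-ise1:9060/ers/config/endpoint/d28b5080-587a-11e8-b043-d8b1906198a4" type="application/xml"/>
    </ns5:resource>
  </ns3:resources>
</ns3:searchResult>"""

# "r" is an arbitrary prefix chosen here; the document itself uses "ns5".
# XPath matches on the namespace URI, not on the prefix text.
NS = {"r": "ers.ise.cisco.com"}

root = etree.fromstring(XML)

# The "link" element has no prefix and the document declares no default
# namespace, so it is addressed without a prefix.
ids = root.xpath("//r:resource/@id", namespaces=NS)
hrefs = root.xpath("//r:resource/link/@href", namespaces=NS)

print(ids)
print(hrefs)
```

If the document did declare a default namespace (xmlns="..."), the link element would need a prefix in the mapping too, since XPath 1.0 has no notion of a default namespace.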

Upvotes: 1
