Marcin Trofiniak

Reputation: 189

Reading XML with lxml: getting a strange string from xmlns in tag names

I am writing a program that reads an XML file and modifies it. But when I try to access any element, its tag comes back with an extra part in front.

My xml file:

<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>sbaa__ApprovalChain__c.ExternalID__c</members>
        <members>sbaa__ApprovalCondition__c.ExternalID__c</members>
        <members>sbaa__ApprovalRule__c.ExternalID__c</members>
        <name>CustomField</name>
    </types>
    <version>40.0</version>
</Package>

And I have my code:

from lxml import etree
import sys

tree = etree.parse('package.xml')
root = tree.getroot()
print( root[0][0].tag )

As output I expect to see members but I get something like this:

{http://soap.sforce.com/2006/04/metadata}members

Why do I see that URL, and how can I stop it from showing up?

Upvotes: 1

Views: 72

Answers (1)

pacholik

Reputation: 8972

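Your root element declares a default namespace with xmlns="..." (Wikipedia, lxml tutorial). A default namespace applies to the element that declares it and to every unprefixed descendant, and lxml reports such tags in the form {namespace}localname. That is why the URL appears in front of members.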
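To see where the prefix comes from, here is a minimal sketch that inspects the namespace map. It parses a shortened inline copy of the XML from the question instead of reading package.xml from disk, and the prefix name "sf" is arbitrary, chosen only for illustration.

```python
from lxml import etree

# Shortened inline copy of the question's XML, so the sketch runs without package.xml.
XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
    <types>
        <members>sbaa__ApprovalChain__c.ExternalID__c</members>
        <name>CustomField</name>
    </types>
    <version>40.0</version>
</Package>"""

root = etree.fromstring(XML)

# The default namespace is stored in nsmap under the key None.
print(root.nsmap)

# lxml reports each tag as "{namespace}localname".
first_member = root[0][0]
print(first_member.tag)

# For lookups, bind the URI to a prefix of your choosing ("sf" is arbitrary).
ns = {"sf": root.nsmap[None]}
members = root.findall("sf:types/sf:members", namespaces=ns)
print([m.text for m in members])
```

Searching with a prefix map like this is usually less fragile than stripping the namespace, because the document itself stays unchanged.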

If you only want to print the tag without the namespace, slice off everything up to and including the closing brace:

tag = root[0][0].tag
print(tag[tag.find('}')+1:])
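As an alternative to slicing the string by hand, lxml's etree.QName can split a tag into its namespace and local name. A minimal sketch, using a one-line stand-in for the question's file (the member text "example" is made up):

```python
from lxml import etree

# One-line stand-in for package.xml; the "example" text is a placeholder.
root = etree.fromstring(
    b'<Package xmlns="http://soap.sforce.com/2006/04/metadata">'
    b"<types><members>example</members></types></Package>"
)

# QName accepts an element (or a "{namespace}localname" string).
qname = etree.QName(root[0][0])
print(qname.localname)   # the tag name without the namespace
print(qname.namespace)   # the namespace URI on its own
```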

If you want to remove the namespace from XML, see this question.
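One common way to do that, sketched here as an assumption rather than as what the linked question recommends, is to rewrite every element's tag to its local name and then drop the unused namespace declaration. The inline XML is a stand-in for the question's file.

```python
from lxml import etree

# One-line stand-in for package.xml; the "example" text is a placeholder.
root = etree.fromstring(
    b'<Package xmlns="http://soap.sforce.com/2006/04/metadata">'
    b"<types><members>example</members></types></Package>"
)

for element in root.iter():
    # Comments and processing instructions have a non-string tag; skip them.
    if isinstance(element.tag, str):
        element.tag = etree.QName(element).localname

# Remove the xmlns declaration that no element uses any more.
etree.cleanup_namespaces(root)

serialized = etree.tostring(root)
print(serialized.decode())
```

Note that this changes the document: a consumer that expects elements in the original namespace may no longer accept the file.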

Upvotes: 1
