Noor

Reputation: 20150

lxml remove element not working

I'm trying to remove an XML element using lxml. The method seems OK, but it's not working. Here's my code:

import lxml.etree as le
f = open('Bird.rdf','r')
doc=le.parse(f)
for elem in doc.xpath("//*[local-name() = 'dc' and namespace-uri() = 'http://purl.org/dc/terms/']"):
    parent=elem.getparent().remove(elem)
print(le.tostring(doc))

Sample XML file:

<rdf:RDF xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/"> 

        <wo:Class rdf:about="/nature/life/Bird#class">
                    <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a
                        covering of feathers, and their front limbs are modified into wings. Some birds, such as
                        penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds
                        are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or
                        they will perish</dc:description>
        </wo:Class>
</rdf:RDF>                  

Upvotes: 0

Views: 931

Answers (1)

tdelaney

Reputation: 77347

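The problem is that local-name() returns the element name without its prefix, which is 'description' here, not 'dc' ('dc' is only the namespace prefix). Your XPath therefore matches nothing and the loop body never runs. You can instead pass a prefix-to-namespace mapping to the xpath function and write the query more directly, as in: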

import lxml.etree as le

txt="""<rdf:RDF xmlns:rdf="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/terms/"
    xmlns:wo="http:/some/wo/namespace">

    <wo:Class rdf:about="/nature/life/Bird#class">
       <dc:description>Birds are a class of vertebrates. They are bipedal, warm-blooded, have a
                        covering of feathers, and their front limbs are modified into wings. Some birds, such as
                        penguins and ostriches, have lost the power of flight. All birds lay eggs. Because birds
                        are warm-blooded, their eggs have to be incubated to keep the embryos inside warm, or
                        they will perish</dc:description>
    </wo:Class>
</rdf:RDF>
"""

namespaces = { 
    "rdf":"http://www.w3.org/2000/01/rdf-schema#",
    "dc":"http://purl.org/dc/terms/",
    "wo":"http:/some/wo/namespace" }

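# Parse the sample document from the string above.
doc = le.fromstring(txt)
# Select every dc:description element using the prefix map.
for elem in doc.xpath("//dc:description", namespaces=namespaces):
    # remove() returns None, so there is nothing useful to assign.
    elem.getparent().remove(elem)
# tostring() returns bytes; decode them for readable output.
print(le.tostring(doc).decode())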
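
If you would rather keep the local-name()/namespace-uri() style from the question, a minimal sketch could look like the following. The trimmed sample string, the constant names and the helper remove_matches are illustrative assumptions, not part of the original code; the only substantive change is comparing local-name() against 'description' instead of 'dc'.

```python
import lxml.etree as le

# Trimmed stand-in for the sample document (illustrative only).
xml = b"""<root xmlns:dc="http://purl.org/dc/terms/">
  <item><dc:description>Birds are a class of vertebrates.</dc:description></item>
</root>"""

# Query from the question: compares local-name() with the prefix 'dc'.
WRONG_QUERY = ("//*[local-name() = 'dc' and "
               "namespace-uri() = 'http://purl.org/dc/terms/']")
# Same query with the actual element name.
FIXED_QUERY = ("//*[local-name() = 'description' and "
               "namespace-uri() = 'http://purl.org/dc/terms/']")


def remove_matches(root, query):
    """Remove every element matched by query; return how many were removed."""
    matches = root.xpath(query)
    for elem in matches:
        elem.getparent().remove(elem)
    return len(matches)


root = le.fromstring(xml)
wrong_matches = root.xpath(WRONG_QUERY)    # no element is literally named 'dc'
removed = remove_matches(root, FIXED_QUERY)
print(len(wrong_matches), removed)
print(le.tostring(root).decode())
```

The prefix-map version above is usually easier to read; this form is mainly handy when you do not want to build a namespaces dictionary.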

Upvotes: 4
