zeno tqjoli

Reputation: 31

Create XML files which have tags with prefix (python and lxml)

I am trying to create an XML file like this:

<pico:record xsi:schemaLocation="http://purl.org/pico/1.0/ http://www.culturaitalia.it/pico/schemas/1.0/pico.xsd">
    <dc:identifier>work_3117</dc:identifier>
</pico:record>

I use this code:

from lxml import etree 
xsi="http://www.w3.org/2001/XMLSchema-instance"
schemaLocation="http://purl.org/pico/1.0/ http://www.culturaitalia.it/pico/schemas/1.0/pico.xsd"
ns = "{xsi}"
root=etree.Element("pico:record", attrib={"{" + xsi + "}schemaLocation" : schemaLocation})
etree.SubElement(root, "dc:identifier").text = "work_3117"


print(etree.tostring(root, pretty_print=True))

This does not work. Python raises:

ValueError: Invalid tag name u'pico:record'

If I change 'pico:record' to 'record', the error becomes:

ValueError: Invalid tag name u'dc:identifier'

Upvotes: 3

Views: 3697

Answers (2)

You Help Me Help You

Reputation: 101

There was a small glitch on line 6 of GHajba's code: in the namespace map, the "pico" prefix has to point to the pico namespace URI. The corrected version:

from lxml import etree
xsi="http://www.w3.org/2001/XMLSchema-instance"
schemaLocation="http://purl.org/pico/1.0/ http://www.culturaitalia.it/pico/schemas/1.0/pico.xsd"
pico = "http://purl.org/pico/1.0/"
dc = "http://purl.org/dc/elements/1.1/"
ns = {"xsi": xsi, "dc": dc, "pico": pico}  # prefix -> namespace URI
root=etree.Element("{" + pico + "}record", attrib={"{" + xsi + "}schemaLocation" : schemaLocation}, nsmap=ns)
etree.SubElement(root, "{" + dc + "}" + "identifier").text = "work_3117"
# tostring() returns bytes on Python 3, so decode before printing
print(etree.tostring(root, pretty_print=True).decode())
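
A quick way to check that the map is right is to look at the prefixes lxml assigns to the elements. This is only a minimal sketch, not part of the original fix; the variable names mirror the code above, and child and schema_location are names I introduced for illustration.

```python
from lxml import etree

xsi = "http://www.w3.org/2001/XMLSchema-instance"
pico = "http://purl.org/pico/1.0/"
dc = "http://purl.org/dc/elements/1.1/"
schema_location = "http://purl.org/pico/1.0/ http://www.culturaitalia.it/pico/schemas/1.0/pico.xsd"

# Prefix -> namespace URI. Every value has to be a namespace URI.
ns = {"xsi": xsi, "dc": dc, "pico": pico}

root = etree.Element(
    "{%s}record" % pico,
    attrib={"{%s}schemaLocation" % xsi: schema_location},
    nsmap=ns,
)
child = etree.SubElement(root, "{%s}identifier" % dc)
child.text = "work_3117"

# Each element reports the prefix that lxml resolved from the namespace map.
print(root.prefix)
print(child.prefix)
```

If "pico" pointed at anything other than the pico namespace URI, root.prefix would not come out as "pico", because lxml would have to pick a different prefix for that namespace.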

Upvotes: 0

GHajba

Reputation: 3691

OK, the question is a bit old but I ran into the same problem today.

lxml does not accept a prefixed name such as "pico:record" as a tag. Instead, you write each tag as {namespace-URI}localname, for "pico" as well as for "dc", and tell lxml which prefix belongs to which namespace URI. You do that with a namespace map (nsmap) that you pass when you create the root element:

from lxml import etree
xsi="http://www.w3.org/2001/XMLSchema-instance"
schemaLocation="http://purl.org/pico/1.0/ http://www.culturaitalia.it/pico/schemas/1.0/pico.xsd"
pico = "http://purl.org/pico/1.0/"
dc = "http://purl.org/dc/elements/1.1/"
ns = {"xsi": xsi, "dc": dc, "pico": pico}
root=etree.Element("{" + pico + "}record", attrib={"{" + xsi + "}schemaLocation" : schemaLocation}, nsmap=ns)
etree.SubElement(root, "{" + dc + "}" + "identifier").text = "work_3117"
print(etree.tostring(root, pretty_print=True).decode())

And the result is:

<pico:record xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:pico="http://purl.org/pico/1.0/" xsi:schemaLocation="http://purl.org/pico/1.0/ http://www.culturaitalia.it/pico/schemas/1.0/pico.xsd">
  <dc:identifier>work_3117</dc:identifier>
</pico:record>

For more details see: http://lxml.de/tutorial.html#namespaces
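
If the string concatenation for the {namespace}name form gets hard to read, etree.QName can build it for you. The following is just a sketch of that variation under the same namespaces as above; the qname helper and the upper-case constant names are my own and not part of lxml.

```python
from lxml import etree

XSI = "http://www.w3.org/2001/XMLSchema-instance"
PICO = "http://purl.org/pico/1.0/"
DC = "http://purl.org/dc/elements/1.1/"
SCHEMA_LOCATION = PICO + " http://www.culturaitalia.it/pico/schemas/1.0/pico.xsd"
NSMAP = {"xsi": XSI, "dc": DC, "pico": PICO}


def qname(namespace, local_name):
    """Hypothetical helper: return '{namespace}local_name' using etree.QName."""
    return etree.QName(namespace, local_name).text


root = etree.Element(
    qname(PICO, "record"),
    attrib={qname(XSI, "schemaLocation"): SCHEMA_LOCATION},
    nsmap=NSMAP,
)
etree.SubElement(root, qname(DC, "identifier")).text = "work_3117"

# tostring() returns bytes on Python 3, so decode before printing.
xml_text = etree.tostring(root, pretty_print=True).decode()
print(xml_text)
```

The serialized document should be the same as the one shown above; only the way the qualified names are assembled differs.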

Upvotes: 4
