Reputation: 41
I'm trying to create a XML file for the word reference source file which is in XML. When I write to the file, with only "xml_decaration=True" it shows <?xml version='1.0' encoding='us-ascii'?>
but I want it in the form <?xml version="1.0"?>
.
from xml.etree.ElementTree import ElementTree
from xml.etree.ElementTree import Element
import xml.etree.ElementTree as ET
import uuid
from lxml import etree
root=Element('b:sources')
root.set('SelectedStyle','')
root.set('xmlns:b','http://schemas.openxmlformats.org/officeDocument/2006/bibliography')
root.set('xmlns','http://schemas.openxmlformats.org/officeDocument/2006/bibliography')
#root.attrib=('SelectedStyle'='', 'xmlns:b'='"http://schemas.openxmlformats.org/officeDocument/2006/bibliography"', 'xmlns:b'='"http://schemas.openxmlformats.org/officeDocument/2006/bibliography"','xmlns'='"http://schemas.openxmlformats.org/officeDocument/2006/bibliography"')
source=ET.SubElement(root, 'b:source')
ET.SubElement(source,'b:Tag')
ET.SubElement(source,'b:SourceType').text='Misc'
ET.SubElement(source,'b:guid').text=str(uuid.uuid1())
Author=ET.SubElement(source,'b:Author')
Author2=ET.SubElement(Author,'b:Author')
ET.SubElement(Author2,'b:Corporate').text='Norsk olje og gass'
ET.SubElement(source, 'b:Title').text='R-002'
ET.SubElement(source, 'b:Year').text='2019'
ET.SubElement(source, 'b:Month').text='10'
ET.SubElement(source, 'b:Day').text='27'
tree=ElementTree(root)
tree.write('Sources.xml', xml_declaration=True, method='xml')
Upvotes: 4
Views: 4545
Reputation: 795
When using xml.etree.ElementTree
there is no way to avoid the inclusion of an encoding attribute in the declaration. If you don't want an encoding attribute in the XML declaration at all, you need to use xml.dom.minidom
not xml.etree.ElementTree
.
Here is a snippet to setup an example:
import xml.etree.ElementTree
a = xml.etree.ElementTree.Element('a')
tree = xml.etree.ElementTree.ElementTree(element=a)
root = tree.getroot()
out = xml.etree.ElementTree.tostring(root, xml_declaration=True)
b"<?xml version='1.0' encoding='us-ascii'?>\n<a />"
us-ascii
:out = xml.etree.ElementTree.tostring(root, encoding='us-ascii', xml_declaration=True)
b"<?xml version='1.0' encoding='us-ascii'?>\n<a />"
unicode
:out = xml.etree.ElementTree.tostring(root, encoding='unicode', xml_declaration=True)
"<?xml version='1.0' encoding='UTF-8'?>\n<a />"
minidom
:Let's take the first example from above with the encoding omitted and use the variable out
as the input to xml.dom.minidom
and you will see the output that you're seeking.
import xml.dom.minidom
dom = xml.dom.minidom.parseString(out)
dom.toxml()
'<?xml version="1.0" ?><a/>'
There is also a pretty print option:
dom.toprettyxml()
'<?xml version="1.0" ?>\n<a/>\n'
Take a look at the source code, and you can see that the encoding is hard coded in the output.
with _get_writer(file_or_filename, encoding) as (write, declared_encoding):
if method == "xml" and (xml_declaration or
(xml_declaration is None and
declared_encoding.lower() not in ("utf-8", "us-ascii"))):
write("<?xml version='1.0' encoding='%s'?>\n" % (
declared_encoding,))
Upvotes: 2