Reputation: 2108
I use lxml to parse an XML document. How can I get the declaration string?
<?xml version="1.0" encoding="utf-8" ?>
I want to check whether it is present, and which encoding and XML version it declares.
Upvotes: 3
Views: 1730
Reputation: 95
Maybe you could just check whether a line with that declaration string can be found in your XML file:
def matchLine(path, line_number, text):
    """
    path = file to be checked
    line_number = 1-based number of the line that will be checked
    text = string containing the text to match
    """
    with open(path) as file:
        for line_no, line_file in enumerate(file, start=1):
            if line_no == line_number:
                # Compare without the trailing newline or whitespace
                return line_file.rstrip() == text
    # The file has fewer than line_number lines
    return False
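An exact string comparison breaks as soon as the declaration uses different quotes or spacing, or shares its line with the root element. As a rough sketch of a more tolerant check, you could match the start of the raw bytes against a regular expression. The names `XML_DECL` and `read_declaration` are made up for this example and are not part of lxml, and the pattern assumes an ASCII-compatible encoding such as UTF-8:

```python
import re

# Loose pattern for an XML declaration at the very start of the data.
# Assumption: the data is in an ASCII-compatible encoding (e.g. UTF-8),
# so the declaration can be matched directly on the raw bytes.
XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\']([0-9.]+)["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
    rb'(?:\s+standalone\s*=\s*["\'](?:yes|no)["\'])?'
    rb'\s*\?>'
)

def read_declaration(data):
    """Return (version, encoding) from the XML declaration, or None if absent.

    encoding is None when the declaration does not name one.
    """
    # Skip a UTF-8 byte order mark if there is one.
    if data.startswith(b'\xef\xbb\xbf'):
        data = data[3:]
    match = XML_DECL.match(data)
    if match is None:
        return None
    version = match.group(1).decode('ascii')
    encoding = match.group(2).decode('ascii') if match.group(2) else None
    return version, encoding

print(read_declaration(b'<?xml version="1.0" encoding="utf-8" ?><root/>'))  # ('1.0', 'utf-8')
print(read_declaration(b'<root/>'))  # None
```

For a file, you would pass in the first few hundred bytes read in binary mode, for example `open(path, 'rb').read(200)`. This only tells you what the declaration says; it does not validate the rest of the document.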
Upvotes: 0
Reputation: 4186
When you parse your document, the resulting ElementTree object should have a DocInfo object (its docinfo attribute) that contains information about the parsed XML or HTML document.
For XML, you may be interested in the xml_version and encoding attributes of this DocInfo:
>>> from lxml import etree
>>> tree = etree.parse('input.xml')
>>> tree.docinfo
<lxml.etree.DocInfo object at 0x7f8111f9ecc0>
>>> tree.docinfo.xml_version
'1.0'
>>> tree.docinfo.encoding
'UTF-8'
Upvotes: 4