GhostKU

Reputation: 2108

How can I get the XML declaration string with lxml?

I use lxml to parse an XML document. How can I get the declaration string?

 <?xml version="1.0" encoding="utf-8" ?> 

I want to check whether it is present, and which encoding and XML version it declares.

Upvotes: 3

Views: 1730

Answers (2)

Robert

Reputation: 95

Maybe you could simply check whether the declaration string (<?xml version="1.0" encoding="utf-8" ?>) appears on the expected line of your XML file:

    def matchLine(path, line_number, text):
        """
        path = path of the file to check
        line_number = 1-based number of the line to compare
        text = string the line must be equal to
        """
        # The with block closes the file even on an early return
        with open(path) as file:
            for line_no, line_file in enumerate(file, start=1):
                if line_no == line_number:
                    # Compare without the trailing newline or whitespace
                    return line_file.rstrip() == text
        # The file has fewer than line_number lines
        return False

Upvotes: 0

301_Moved_Permanently

Reputation: 4186

When you parse your document, the resulting ElementTree object exposes a DocInfo object through its docinfo attribute. It contains information about the XML or HTML document that was parsed.

For XML, you may be interested in the xml_version and encoding attributes of this DocInfo:

    >>> from lxml import etree
    >>> tree = etree.parse('input.xml')
    >>> tree.docinfo
    <lxml.etree.DocInfo object at 0x7f8111f9ecc0>
    >>> tree.docinfo.xml_version
    '1.0'
    >>> tree.docinfo.encoding
    'UTF-8'
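
If you also need to know whether the declaration is literally present in the file, one option is to look at the raw bytes yourself, independently of lxml. The following is only a minimal sketch: the function name read_xml_declaration and the regular expression are my own and are not part of lxml, and it assumes an ASCII-compatible encoding such as UTF-8 (it will not recognise a UTF-16 file).

```python
import re

# Hypothetical helper, not an lxml API: matches an XML declaration at the
# very start of the data, optionally preceded by a UTF-8 byte order mark.
_XML_DECL = re.compile(
    rb'^(?:\xef\xbb\xbf)?<\?xml\s+version\s*=\s*(["\'])(?P<version>[^"\']+)\1'
    rb'(?:\s+encoding\s*=\s*(["\'])(?P<encoding>[^"\']+)\3)?'
    rb'[^?]*\?>'
)


def read_xml_declaration(data):
    """Return (version, encoding) from the declaration, or None if absent.

    data is the beginning of the document as bytes. encoding is None when
    the declaration does not specify one.
    """
    match = _XML_DECL.match(data)
    if match is None:
        return None
    version = match.group("version").decode("ascii")
    encoding = match.group("encoding")
    if encoding is not None:
        encoding = encoding.decode("ascii")
    return version, encoding
```

To use it on a file, open the file in binary mode and pass the first few hundred bytes, for example read_xml_declaration(f.read(200)). A None result means no declaration was found at the start of the file.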

Upvotes: 4
