Vismark Juarez
Vismark Juarez

Reputation: 773

Python: How to check if an XML string is valid XML

I am writing a program that will receive a string via stdin. The String will always have the root node <Exam>, and two child nodes: <Question> and <Answer>. What's a function that will validate that the XML is properly formatted (is not missing any tags or angled brackets)?

I've tried using etree but am running into errors:

def isProperlyFormattedXML():
    parser = etree.XMLParser(dtd_validation=True)
    schema_root = etree.XML('''\
        <Exam>
            <Question type="Short Response">
                What does OOP stand for?
            </Question>
            <Answer type="Short Response">
                "Object Oriented programming"
            </Answer>
        </Exam>
        ''')
    schema = etree.XMLSchema(schema_root)

    #Good xml
    parser = etree.XMLParser(schema = schema)
    try:
        root = etree.fromstring("<a>5</a>", parser)
        print ("Finished validating good xml")

        return True
    except lxml.etree.XMLSyntaxError as err:
        print (err)

    #Bad xml
    parser = etree.XMLParser(schema = schema)
    try:
        root = etree.fromstring("<a>5<b>foobar</b></a>", parser)
    except lxml.etree.XMLSyntaxError as err:
        print (err)
        return False

Error:

lxml.etree.XMLSchemaParseError: The XML document 'in_memory_buffer' is not a schema document.```

Upvotes: 3

Views: 6220

Answers (1)

JakobDev
JakobDev

Reputation: 142

You have the solution already. You have to use try/except to check that.

Upvotes: 2

Related Questions