peizidiao

Reputation: 15

End XML parsing early in SAX

When I use xml.sax to parse an XML file, I want to stop the parser at an arbitrary point, for example once the program meets a certain condition. Please don't suggest exiting or raising an exception that ends the program, because I want the program to continue running afterwards.

As marked by the comment in endElement, how do I finish parsing early and still reach the line in sax_parse that prints e.g. 'hello'?

from xml import sax

class myHandler(sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0
        
    def startElement(self, name, attrs):
        if name == 'row':
            self.count += 1

    def endElement(self, name):
        if self.count > 10000:
            # stop the parser
            pass

def sax_parse():
    parser = sax.make_parser()
    parser.setFeature(sax.handler.feature_namespaces, 0)
    handler = myHandler()
    parser.setContentHandler(handler)
    parser.parse("url")
    print('hello')

Upvotes: 1

Views: 240

Answers (1)

balderman

Reputation: 23815

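The code below should work for you.

The idea is to raise a dedicated exception from the handler, catch it around parser.parse(), and ignore it. The exception only unwinds the parser; the program carries on after the try block.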

from xml import sax

# Raised by the handler to signal that parsing can stop early.
class HaveEnoughRows(Exception):
    pass

class myHandler(sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.count = 0
        
    def startElement(self, name, attrs):
        if name == 'row':
            self.count += 1

    def endElement(self, name):
        if self.count > 10000:
            # stop the parser: this propagates out of parser.parse()
            raise HaveEnoughRows()

def sax_parse():
    parser = sax.make_parser()
    parser.setFeature(sax.handler.feature_namespaces, 0)
    handler = myHandler()
    parser.setContentHandler(handler)
    try:
        parser.parse("url")
    except HaveEnoughRows:
        pass
    print('hello')
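
If you want to try the idea without a real file or URL, here is a minimal self-contained sketch of the same raise-and-catch approach. The names StopParsing, RowLimitHandler, count_rows and make_sample are made up for this illustration, and the in-memory sample document and the small limit are assumptions chosen so it runs quickly. Unlike the code above, it checks the limit in startElement, so the count stops exactly at the limit.

```python
import io
from xml import sax


class StopParsing(Exception):
    """Hypothetical signal used only to abort parsing early."""


class RowLimitHandler(sax.ContentHandler):
    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.count = 0

    def startElement(self, name, attrs):
        if name == "row":
            self.count += 1
            if self.count >= self.limit:
                # Unwinds out of parser.parse(); the caller catches it.
                raise StopParsing()


def count_rows(source, limit):
    """Parse `source` and return the number of rows seen, at most `limit`."""
    parser = sax.make_parser()
    handler = RowLimitHandler(limit)
    parser.setContentHandler(handler)
    try:
        parser.parse(source)
    except StopParsing:
        pass  # Expected: we have enough rows, so just carry on.
    return handler.count


def make_sample(n):
    """Build an in-memory document with n <row/> elements (test data only)."""
    xml_text = "<rows>" + "<row/>" * n + "</rows>"
    return io.BytesIO(xml_text.encode("utf-8"))


if __name__ == "__main__":
    print(count_rows(make_sample(50), limit=10))
    print("hello")
```

Because the exception is caught right around parser.parse(), nothing else in the program sees it. If the document has fewer rows than the limit, parsing simply runs to the end and the function returns the real count.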

Upvotes: 2
