kanadianDri3

Reputation: 371

Validating an XML document against an XSD schema and catching validator exceptions with Groovy

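import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

def validateXml(xml){

    String xsd = "src/main/ressources/fulltext-documents-v1.2.3.xsd"

    def factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
    def schema = factory.newSchema(new StreamSource(new File(xsd)))
    def validator = schema.newValidator()
    // Throws a SAXParseException on the first validation error
    validator.validate(new StreamSource(new StringReader(xml)))
}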

This is my function to validate the String representation of an XML document. Below is another function that catches the exceptions the validator may raise:

import groovy.xml.XmlUtil
import org.xml.sax.SAXParseException

def xmlVerification(xml) {

    Node rootNode = new XmlParser().parseText(xml)
    def stringXml = XmlUtil.serialize(rootNode)

    try{
        validateXml(stringXml)
        println "no error in text"
    }catch(SAXParseException e){
        // Only the first error reaches this point; validation stops there
        println "column number "+e.getColumnNumber()
        println "line number "+e.getLineNumber()
    }
}

For now it only shows the column and line number where the exception was raised (good enough for me at the moment).

Now, let's assume I have a document with at least 2 errors. I would like to collect both errors (in a list, for example) and then handle them. With my code, validation stops at the first exception raised, so I can't handle both errors. I have to correct the first one and re-run my code before I can see the second one.

Any idea how I can go through the entire document, store all the exceptions and handle them in an .each{} loop or something like that?

I hope that was clear enough.

Thanks in advance!

Upvotes: 2

Views: 2984

Answers (1)

tim_yates

Reputation: 171194

This should do what you want:

import org.xml.sax.ErrorHandler
import static javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.Schema
import javax.xml.validation.SchemaFactory
import javax.xml.validation.Validator

List findProblems( File xml, File xsd ) {
  SchemaFactory factory = SchemaFactory.newInstance( W3C_XML_SCHEMA_NS_URI )
  Schema schema = factory.newSchema( new StreamSource( xsd ) )
  Validator validator = schema.newValidator()
  List exceptions = []
  Closure<Void> handler = { exception -> exceptions << exception }
  validator.errorHandler = [ warning:    handler,
                             fatalError: handler,
                             error:      handler ] as ErrorHandler
  validator.validate( new StreamSource( xml ) )
  exceptions
}

// Two files I got for testing
File xml = new File( 'books.xml' )
File xsd = new File( 'books.xsd' )

// Call the method, and print out each exception
findProblems( xml, xsd ).each {
  println "Problem @ line $it.lineNumber, col $it.columnNumber : $it.message"
}

Or a slightly more idiomatic Groovy version would be:

import org.xml.sax.ErrorHandler
import static javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

List findProblems( File xml, File xsd ) {
  SchemaFactory.newInstance( W3C_XML_SCHEMA_NS_URI )
               .newSchema( new StreamSource( xsd ) )
               .newValidator().with { validator ->
    List exceptions = []
    Closure<Void> handler = { exception -> exceptions << exception }
    errorHandler = [ warning: handler, fatalError: handler, error: handler ] as ErrorHandler
    validate( new StreamSource( xml ) )
    exceptions
  }
}
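
Since the question validates XML held in a String rather than in a File, here is a rough sketch of the same idea adapted to String input. The name findProblemsInText and the sample book schema are made up for illustration, not taken from the question. Some parsers still throw after reporting a fatal error (malformed XML, for instance), so the sketch catches that exception and records it only if the handler has not already seen it:

```groovy
import static javax.xml.XMLConstants.W3C_XML_SCHEMA_NS_URI
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXParseException

// Hypothetical helper: validates XML text against XSD text and returns every problem reported
List<SAXParseException> findProblemsInText( String xml, String xsd ) {
  def validator = SchemaFactory.newInstance( W3C_XML_SCHEMA_NS_URI )
                               .newSchema( new StreamSource( new StringReader( xsd ) ) )
                               .newValidator()
  List<SAXParseException> problems = []
  Closure handler = { SAXParseException e -> problems << e }
  validator.errorHandler = [ warning: handler, error: handler, fatalError: handler ] as ErrorHandler
  try {
    validator.validate( new StreamSource( new StringReader( xml ) ) )
  } catch( SAXParseException e ) {
    // The parser may abort after a fatal error; skip it if the handler already recorded it
    boolean alreadySeen = problems.any {
      it.lineNumber == e.lineNumber && it.columnNumber == e.columnNumber && it.message == e.message
    }
    if( !alreadySeen ) problems << e
  }
  problems
}

// Made-up schema and document for illustration: <pages> and <year> both hold invalid values
String xsd = '''<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="pages" type="xs:positiveInteger"/>
        <xs:element name="year" type="xs:gYear"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'''

String xml = '<book><title>T</title><pages>many</pages><year>soon</year></book>'

findProblemsInText( xml, xsd ).each {
  println "Problem @ line ${it.lineNumber}, col ${it.columnNumber} : ${it.message}"
}
```

Note that a validator can report more than one exception for a single bad value, so the list may be longer than the number of mistakes in the document.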

Upvotes: 6
