bobber205

Reputation: 13372

DTD Info and Related Errors when Validating (XSD Schema) -- Can They Be Ignored?

I've got a large number of XML files. For years they've caused trouble because the people who write them do so by hand, so errors naturally creep in. It's high time we started validating them and reporting what's wrong whenever someone tries to use one of these files.

I'm using the SAX parser and getting a list of errors.

Below is my code:

    BookValidationErrorHandler errorHandler = new BookValidationErrorHandler();

    SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setValidating(true);
    factory.setNamespaceAware(true);

    SchemaFactory schemaFactory =
        SchemaFactory.newInstance("http://www.w3.org/2001/XMLSchema");

    factory.setSchema(schemaFactory.newSchema(
        new Source[] {new StreamSource("test.xsd")}));

    javax.xml.parsers.SAXParser parser = factory.newSAXParser();
    org.xml.sax.XMLReader reader = parser.getXMLReader();

    reader.setErrorHandler(errorHandler);
    reader.parse(new InputSource("bad.xml"));

The first couple errors are always:

Line Number: 2: Document is invalid: no grammar found.
Line Number: 2: Document root element "credits", must match DOCTYPE root "null".

We can't possibly go and edit the thousands of XML files that need to be checked.

Is there anything I can easily add to the front of the source to prevent this? Is there a way to tell the parser to ignore these DTD-related errors? I'm not even sure what the "no grammar found" error means, though I sort of understand the second one.

Upvotes: 2

Views: 3340

Answers (3)

Michael Kay

Reputation: 163322

Calling setValidating(true) requests DTD validation, which fails if the document has no DTD. If you only want schema validation and not DTD validation, use setValidating(false). From the Javadoc for setValidating():

To use modern schema languages such as W3C XML Schema or RELAX NG instead of DTD, you can configure your parser to be a non-validating parser by leaving the setValidating(boolean) method false, then use the setSchema(Schema) method to associate a schema to a parser.
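
To make that concrete, here is a minimal sketch of schema-only validation. The inline XSD string and the class and method names are made up for illustration (the inline schema stands in for test.xsd from the question); the error handler simply collects messages instead of using BookValidationErrorHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SchemaOnlyValidation {

    // Hypothetical schema standing in for test.xsd from the question.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='credits'><xs:complexType><xs:sequence>"
        + "<xs:element name='name' type='xs:string' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Parses the XML and returns the validation errors reported against the schema.
    static List<String> validate(String xml) throws Exception {
        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(XSD)));

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false); // false = no DTD validation
        factory.setSchema(schema);    // schema validation still happens

        final List<String> errors = new ArrayList<>();
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add("Line " + e.getLineNumber() + ": " + e.getMessage());
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // No DOCTYPE in either document; only schema errors should be reported.
        System.out.println(validate("<credits><name>A</name></credits>"));
        System.out.println(validate("<credits><oops/></credits>"));
    }
}
```

With this setup the missing DOCTYPE should no longer be reported, so the existing XML files would not need to be touched.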

Upvotes: 8

daiscog

Reputation: 12057

You can still use a validating parser and you don't need to preset the schema in the parser, if you are using a JAXP-compliant parser and you configure it correctly as per the Oracle documentation:

SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
spf.setValidating(true);
SAXParser saxParser = spf.newSAXParser();
// Important step next:  Tell the parser which XML schema-definition language to expect:
saxParser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");
// Now when we parse a file without a DTD, we no longer get an error 
// (as long as an XSD schema is defined in the file):
saxParser.parse(source, handler);
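
If the XML files do not reference an XSD themselves, JAXP also defines a companion schemaSource property that supplies the schema from outside the document. The sketch below assumes the parser supports that property (the one bundled with the JDK is expected to); the inline XSD and the class and method names are invented for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class JaxpSchemaSourceExample {

    static final String SCHEMA_LANGUAGE =
        "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String SCHEMA_SOURCE =
        "http://java.sun.com/xml/jaxp/properties/schemaSource";

    // Hypothetical schema; in practice this would be a file or URL.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='credits'><xs:complexType><xs:sequence>"
        + "<xs:element name='name' type='xs:string' maxOccurs='unbounded'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    static List<String> validate(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setValidating(true);
        SAXParser saxParser = spf.newSAXParser();

        // The schema language must be set before the schema source.
        saxParser.setProperty(SCHEMA_LANGUAGE, "http://www.w3.org/2001/XMLSchema");
        saxParser.setProperty(SCHEMA_SOURCE, new InputSource(new StringReader(XSD)));

        final List<String> errors = new ArrayList<>();
        // The DefaultHandler passed to parse() also acts as the error handler.
        saxParser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate("<credits><name>A</name></credits>"));
        System.out.println(validate("<credits><oops/></credits>"));
    }
}
```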

Upvotes: 0

Leo

Reputation: 75

I ran into the same problem recently and found this thread while looking for a solution. My solution was to use an EntityResolver. It seems that setting the Schema alone is not enough, at least not for me. This is an EntityResolver example:

public class CustomResolver implements EntityResolver {
    @Override
    public InputSource resolveEntity(String publicId, String systemId) 
            throws SAXException, IOException {

        // Map each known system ID to a local copy of its XSD.
        if ("http://namespace1.example.com/ex1".equals(systemId)) {
            return new InputSource("xsd_for_namespace1_path");
        } else if ("http://namespace2.example.com/ex2".equals(systemId)) {
            return new InputSource("xsd_for_namespace2_path");
        } else if ("http://namespace3.example.com/ex3".equals(systemId)) {
            return new InputSource("xsd_for_namespace3_path");
        }

        // Returning null lets the parser fall back to its default resolution.
        return null;
    }
}

I also disable DTD validation with setValidating(false). This is my parser configuration:

SAXParserFactory saxpf = SAXParserFactory.newInstance();
saxpf.setNamespaceAware(true);
saxpf.setSchema(getSchema());
saxpf.setValidating(false);
SAXParser saxParser = saxpf.newSAXParser();
// getXMLReader() replaces the deprecated getParser(); the resolver is the class shown above.
saxParser.getXMLReader().setEntityResolver(new CustomResolver());

The getSchema() method instantiates a Schema the way you do in your code, but with more sources.
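
As a rough sketch of what such a getSchema() might look like, the example below combines two schemas with different target namespaces into one Schema. The two inline XSDs, their urn:example namespaces and the class name are placeholders, not my real files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class MultiSchemaExample {

    // Placeholder schemas, one per namespace.
    static final String XSD_A =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:a' elementFormDefault='qualified'>"
        + "<xs:element name='a' type='xs:int'/></xs:schema>";
    static final String XSD_B =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
        + " targetNamespace='urn:example:b' elementFormDefault='qualified'>"
        + "<xs:element name='b' type='xs:string'/></xs:schema>";

    // Builds a single Schema object from several sources.
    static Schema getSchema() throws SAXException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        return sf.newSchema(new Source[] {
            new StreamSource(new StringReader(XSD_A)),
            new StreamSource(new StringReader(XSD_B))
        });
    }

    static List<String> validate(String xml) throws Exception {
        SAXParserFactory saxpf = SAXParserFactory.newInstance();
        saxpf.setNamespaceAware(true);
        saxpf.setSchema(getSchema());
        saxpf.setValidating(false); // schema validation only, no DTD validation

        final List<String> errors = new ArrayList<>();
        XMLReader reader = saxpf.newSAXParser().getXMLReader();
        reader.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) {
                errors.add(e.getMessage());
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(validate("<a xmlns='urn:example:a'>12</a>"));
        System.out.println(validate("<b xmlns='urn:example:b'>text</b>"));
        System.out.println(validate("<a xmlns='urn:example:a'>twelve</a>"));
    }
}
```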

I hope this helps anyone who runs into the same error.

Upvotes: 0
