Reputation: 10250
I am validating XML documents against a schema. Some of the more complex documents/schemas always fail validation with this code:
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
dbfac.setNamespaceAware(true);
dbfac.setIgnoringElementContentWhitespace(true);
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.parse("sampleResponse.xml");
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source schemaSource = new StreamSource(getClass().getResourceAsStream("/" + "SampleResponse.xsd"));
Schema schema = schemaFactory.newSchema(schemaSource);
Validator validator = schema.newValidator();
Source source = new DOMSource(doc);
// Set a custom error handler that simply rethrows every exception
validator.setErrorHandler(new ValidationErrorHandler());
validator.validate(source);
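The ValidationErrorHandler class is not shown in the question. A minimal sketch of what it might look like, assuming it does nothing more than rethrow whatever the validator reports (the class body below is an assumption, not the original code):

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical reconstruction: rethrows every problem the validator reports,
// so that validate(...) fails fast instead of continuing silently.
public class ValidationErrorHandler implements ErrorHandler {

    @Override
    public void warning(SAXParseException e) throws SAXException {
        throw e;
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        throw e;
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        throw e;
    }
}
```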
The problem is this line:
Source schemaSource = new StreamSource(getClass().getResourceAsStream("/" + "SampleResponse.xsd"));
If I read the schema as a file, it works:
Source schemaSource = new StreamSource(new File("somepath/SampleResponse.xsd"));
Why doesn't validation work when I get the schema directly from classpath?
(Using Java 1.6 on Windows 7 64-bit)
Exception message when failing:
Could not validate against schema SampleResponse.xsd. Nested exception: src-resolve: Cannot resolve the name 'oa:Attachments' to a(n) 'element declaration' component.
Upvotes: 7
Views: 8681
Reputation: 4007
For posterity, here is a Scala version, inspired by Joman68's answer (https://stackoverflow.com/a/50518995/434405), that does not use the Spring libraries.
import javax.xml.XMLConstants
import javax.xml.transform.Source
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.{Schema, SchemaFactory, Validator}
object SchemaCheck extends App {
  private val xsds = List("schema.xsd") // add more as required
  private val schemaDocuments: Array[Source] = xsds.map { xsd =>
    val res = getClass.getResource(s"/$xsd")
    val dtd = new StreamSource(res.toURI.toString)
    dtd.setSystemId(res.toURI.toString)
    dtd
  }.toArray
  private val sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
  private val s: Schema = sf.newSchema(schemaDocuments)
  private val v: Validator = s.newValidator()
  private val instanceDocument: Source = new StreamSource(new java.io.File("test.xml"))
  v.validate(instanceDocument)
}
Upvotes: 0
Reputation: 2850
I found that I did not need to implement an EntityResolver to make relative URLs resolvable from the classpath.
It was sufficient to set the system id to the URI of the classpath resource.
The following is a worked example that uses Spring to build a list of StreamSources from .xsd files on the classpath.
import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import java.util.ArrayList;
import java.util.List;
PathMatchingResourcePatternResolver patternResolver = new PathMatchingResourcePatternResolver();
Resource[] theResources = patternResolver.getResources("classpath:schemas/**/*.xsd");
List<Source> sources = new ArrayList<>();
for (Resource resource : theResources) {
    StreamSource dtd = new StreamSource(resource.getInputStream());
    // The system id gives the validator a base URI for relative schemaLocation values
    dtd.setSystemId(resource.getURI().toString());
    sources.add(dtd);
}
The patternResolver is given a pattern of classpath:schemas/**/*.xsd
which allows it to recursively find all .xsd files in the schemas directory on the classpath.
The .xsd files can import other .xsd files using relative paths. For example a .xsd file could include an import like this:
<xsd:import namespace="urn:www.example.com/common" schemaLocation="./common.xsd"/>
This line:
dtd.setSystemId(resource.getURI().toString());
is the key to having the relative paths in the .xsd files resolved by the schema validator.
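To illustrate why the system id matters, here is a rough sketch of how a relative schemaLocation is resolved against a base URI. It uses java.net.URI as a stand-in for the parser's own resolution logic, and the class name, method name, and file: path are made up for the example. Note that this stand-in only works for hierarchical URIs such as file:, not for jar: URIs, which java.net.URI treats as opaque.

```java
import java.net.URI;

// Illustration only: approximates how a relative schemaLocation is resolved
// against the system id of the schema document that contains the import.
public class SystemIdResolution {

    public static String resolveSchemaLocation(String systemId, String schemaLocation) {
        // The system id acts as the base URI; the schemaLocation is resolved relative to it
        return URI.create(systemId).resolve(schemaLocation).toString();
    }

    public static void main(String[] args) {
        // Hypothetical base location of the importing schema
        String base = "file:/app/schemas/main.xsd";
        // prints "file:/app/schemas/common.xsd"
        System.out.println(resolveSchemaLocation(base, "./common.xsd"));
    }
}
```

Without a system id there is no base URI to resolve against, which is consistent with the src-resolve failure reported in the question.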
The list of StreamSources (sources) built above can now be used to set the schema sources for XML validation:
import org.xmlunit.builder.Input;
import org.xmlunit.validation.Languages;
import org.xmlunit.validation.Validator;
import javax.xml.transform.Source;
Validator v = Validator.forLanguage(Languages.W3C_XML_SCHEMA_NS_URI);
v.setSchemaSources(sources.toArray(new Source[sources.size()]));
Source input = Input.fromByteArray(xmlBytes).build();
v.validateInstance(input);
The validateInstance method call validates the XML represented by the xmlBytes array.
Upvotes: 2
Reputation: 8885
When you pass a File to StreamSource, the source reads the contents of the file, and the systemId is also set to the URL of the File. This allows relative URIs in your schema to be resolved. If your schema has any relative URIs, this is definitely your problem: getResourceAsStream gives you only an InputStream, so the systemId stays null. To make those relative URIs resolvable when reading the schema from the classpath, you need to implement an EntityResolver. Even if you don't use relative URIs, there might be other, more subtle effects of the systemId being null. I would recommend using the constructor
StreamSource(InputStream inputStream, String systemId)
Try setting systemId to each of the following: null, the File containing the schema, some other file, a File that doesn't exist. That might give you a hint of what the Validator is doing with the systemId.
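As a sketch of that constructor in use with a classpath resource: the helper below looks the schema up with getResource (which returns a URL) rather than getResourceAsStream, and passes the URL's string form as the systemId. The class and method names are hypothetical, and try-with-resources needs Java 7 or later, so on Java 6 the stream would have to be closed in a finally block instead.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: loads a schema while keeping its location as the systemId,
// so relative imports/includes inside the schema have a base URI.
public class ClasspathSchemaLoader {

    /** Compiles the schema at the given URL, using the URL as the systemId. */
    public static Schema loadSchema(URL schemaUrl) throws IOException, SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        try (InputStream in = schemaUrl.openStream()) {
            StreamSource source = new StreamSource(in, schemaUrl.toExternalForm());
            return factory.newSchema(source);
        }
    }

    /** Looks the schema up on the classpath, e.g. "/SampleResponse.xsd". */
    public static Schema loadSchemaFromClasspath(String resourcePath)
            throws IOException, SAXException {
        URL url = ClasspathSchemaLoader.class.getResource(resourcePath);
        if (url == null) {
            throw new IOException("Schema not found on classpath: " + resourcePath);
        }
        return loadSchema(url);
    }
}
```

With this in place, the question's code could call ClasspathSchemaLoader.loadSchemaFromClasspath("/SampleResponse.xsd") and then schema.newValidator() as before.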
Upvotes: 9