Reputation: 92210
I have the following code:
    public XsdValidator(Resource... xsds) {
        Preconditions.checkArgument(xsds != null);
        try {
            this.xsds = ImmutableList.copyOf(xsds);
            SchemaFactory schemaFactory = SchemaFactory.newInstance(W3C_XML_SCHEMA_NS_URI);
            LOGGER.debug("Schema factory created: {}", schemaFactory);
            StreamSource[] streamSources = streamSourcesOf(xsds);
            LOGGER.debug("StreamSource[] created: {}", streamSources);
            Schema schema = schemaFactory.newSchema(streamSources);
            LOGGER.debug("Schema created: {}", schema);
            validator = schema.newValidator();
            LOGGER.debug("Validator created: {}", validator);
        } catch (Exception e) {
            throw new IllegalArgumentException("Can't build XsdValidator", e);
        }
    }
It seems the line schemaFactory.newSchema(streamSources); takes a very long time (30 seconds) to execute against my XSD file.
After many tests on this XSD, it seems it's because I have:
    <xs:complexType name="entriesType">
        <xs:sequence>
            <xs:element type="prov:entryType" name="entry" minOccurs="0" maxOccurs="10000" />
        </xs:sequence>
    </xs:complexType>
The problem is maxOccurs="10000". With maxOccurs="1" or maxOccurs="unbounded", it is very fast.
Can someone tell me what the problem is with using maxOccurs="10000"?
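For anyone who wants to reproduce this, here is a minimal timing sketch. It is not my real schema: the class name MaxOccursTiming, the helper names, and the use of xs:string in place of prov:entryType are all assumptions made to keep it self-contained. Whether the slowdown shows up will likely depend on the JDK / Xerces version in use, so treat the numbers as indicative only.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class MaxOccursTiming {

    // Builds a small stand-in schema (hypothetical, not the original XSD);
    // only the maxOccurs value on the "entry" particle varies.
    static String schemaWith(String maxOccurs) {
        return "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
             + "<xs:element name=\"entries\" type=\"entriesType\"/>"
             + "<xs:complexType name=\"entriesType\"><xs:sequence>"
             + "<xs:element name=\"entry\" type=\"xs:string\""
             + " minOccurs=\"0\" maxOccurs=\"" + maxOccurs + "\"/>"
             + "</xs:sequence></xs:complexType>"
             + "</xs:schema>";
    }

    // Returns the time spent in newSchema() in nanoseconds,
    // or -1 if the schema processor rejects the schema.
    static long timeCompileNanos(String maxOccurs) {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        StreamSource source = new StreamSource(new StringReader(schemaWith(maxOccurs)));
        long start = System.nanoTime();
        try {
            factory.newSchema(source);
        } catch (SAXException e) {
            return -1;
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // The three values compared in the question.
        for (String maxOccurs : new String[] {"1", "unbounded", "10000"}) {
            long nanos = timeCompileNanos(maxOccurs);
            if (nanos < 0) {
                System.out.println("maxOccurs=" + maxOccurs + ": schema rejected");
            } else {
                System.out.println("maxOccurs=" + maxOccurs + ": " + (nanos / 1_000_000) + " ms");
            }
        }
    }
}
```

Only the newSchema call is timed, since that is the line that is slow in my constructor.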
Upvotes: 4
Views: 1373
Reputation: 21658
In my experience, particles bounded by what some may consider "unreasonably" high values are a cause of performance problems (this link is from my browser's favourites).
The underlying cause seems to be memory allocation, which grows with the maxOccurs value.
I also recall a documentation item stating a threshold value beyond which, for all intents and purposes, the parser treats maxOccurs as unbounded, regardless of what the XSD says (I'll revisit this post if I find it).
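Since maxOccurs="unbounded" is reported to be fast, one possible workaround is to declare the particle as unbounded in the XSD and enforce the 10000 cap in application code after schema validation. The sketch below is only an illustration under assumptions: the class name EntryCountCheck, the constant MAX_ENTRIES, and the premise that the entry elements are direct children of the document's root element are all hypothetical and not taken from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EntryCountCheck {

    // Hypothetical cap mirroring the maxOccurs="10000" from the question.
    static final int MAX_ENTRIES = 10000;

    // Counts the direct children of the root element whose local name is "entry".
    static int countEntries(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed so getLocalName() is populated
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            int count = 0;
            for (Node n = doc.getDocumentElement().getFirstChild();
                 n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE
                        && "entry".equals(n.getLocalName())) {
                    count++;
                }
            }
            return count;
        } catch (Exception e) {
            throw new IllegalArgumentException("Can't parse document", e);
        }
    }

    // True when the document stays within the cap the schema no longer enforces.
    static boolean withinLimit(String xml) {
        return countEntries(xml) <= MAX_ENTRIES;
    }

    public static void main(String[] args) {
        String xml = "<entries><entry/><entry/></entries>";
        System.out.println(countEntries(xml) + " entries, within limit: " + withinLimit(xml));
    }
}
```

The trade-off is that the limit no longer lives in the schema, so other consumers of the XSD will not see it.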
Upvotes: 4