David Thielen

Reputation: 32934

Saxon: Can't open XML with schema in .NET, works fine in Java

I am trying to create a Saxon XPathCompiler. I have the same code in Java & .NET, each calling the appropriate Saxon library. The code is:

protected void ctor(InputStream xmlData, InputStream schemaFile, boolean preserveWhiteSpace) throws SAXException, SchemaException, SaxonApiException {
    this.rootNode = makeDataSourceNode(null);
    XMLReader reader = XMLReaderFactory.createXMLReader();

    InputSource xmlSource = new InputSource(xmlData);
    SAXSource saxSource = new SAXSource(reader, xmlSource);
    Source schemaSource = new StreamSource(schemaFile);
    Configuration config = createEnterpriseConfiguration();
    config.addSchemaSource(schemaSource);
    // ...

In the case of .NET, the InputStreams are instances of a class that wraps a .NET Stream and exposes it as a Java InputStream. In Java the above code works fine. But in .NET, the last line, config.addSchemaSource(schemaSource), throws:

$exception {"Content is not allowed in prolog."} org.xml.sax.SAXParseException

In both Java & .NET it works fine if there is no schema.

The files it is using are http://www.thielen.com/test/SouthWind.xml & http://www.thielen.com/test/SouthWind.xsd

It does not appear to be any of the issues in this question. And if one of those were the cause, shouldn't both Java and .NET have the same problem?

I'm thinking maybe it's the wrapper around the .NET Stream to make it a Java InputStream, but we use that class everywhere without any other issues.

Upvotes: 2

Views: 40

Answers (1)

Michael Kay

Reputation: 163312

The "Content is not allowed in prolog" exception is absolutely infuriating - if only it told you which bytes it is complaining about! One diagnostic technique is to display the initial bytes delivered by the InputStream. Make a few calls to read(), each of which returns the next byte as an int (or -1 at end of stream):

System.err.println(schemaFile.read())

My first guess as to the cause would be something to do with byte order marks, but rather than speculate, I would focus on diagnostics to see what the parser is seeing in that InputStream that it doesn't like.
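
If it helps, here is a minimal sketch of that diagnostic, assuming you can wrap the stream in a BufferedInputStream before handing it to Saxon. The class name StreamPeek and the method hexPrefix are made up for illustration; they are not part of Saxon or the JDK. It prints the leading bytes in hex and then rewinds, so the same stream can still be passed on afterwards.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Hypothetical helper, not part of Saxon: shows what a parser would see first.
class StreamPeek {

    /** Returns up to n leading bytes as hex (e.g. "EF BB BF 3C"), then rewinds the stream. */
    static String hexPrefix(BufferedInputStream in, int n) throws IOException {
        in.mark(n);                       // remember the current position
        byte[] buf = new byte[n];
        int total = 0;
        while (total < n) {               // read() may return fewer bytes than requested
            int r = in.read(buf, total, n - total);
            if (r < 0) break;             // end of stream
            total += r;
        }
        in.reset();                       // rewind so the bytes can be read again

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < total; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", buf[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Sample data only: a UTF-8 byte order mark followed by "<?xml".
        byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', '?', 'x', 'm', 'l'};
        BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(sample));
        System.err.println(hexPrefix(in, 8));
    }
}
```

In your code that would mean wrapping schemaFile in a BufferedInputStream, printing hexPrefix for it, and then building the StreamSource from the wrapped stream. A well-formed document normally starts with 3C (the "<" character), possibly preceded by a byte order mark such as EF BB BF for UTF-8; anything else at the front is what the parser is objecting to. Comparing the output on Java and .NET for the same .xsd should show whether the wrapper is delivering different bytes.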

Upvotes: 1
