user4325449

Reputation: 85

XMLStreamException : Parse error

I have a process that parses an XML file. It worked with Java 5 on Apache Tomcat 6. Since I recompiled it with Java 7 and started running it on Apache Tomcat 7, I get the following error:

Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,60]
Message: Invalid encoding name "ISO8859-1".
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.setInputSource(XMLStreamReaderImpl.java:219)
    at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.<init>(XMLStreamReaderImpl.java:189)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.getXMLStreamReaderImpl(XMLInputFactoryImpl.java:262)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:129)
    at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLEventReader(XMLInputFactoryImpl.java:78)
    at org.simpleframework.xml.stream.StreamProvider.provide(StreamProvider.java:66)
    at org.simpleframework.xml.stream.NodeBuilder.read(NodeBuilder.java:58)
    at org.simpleframework.xml.core.Persister.read(Persister.java:543)
    at org.simpleframework.xml.core.Persister.read(Persister.java:444)

Here is the xml fragment used:

<?xml version="1.0" encoding="ISO8859-1" standalone="no" ?>

If I replace ISO8859-1 with UTF-8, the parsing works, but that is not an option for me.

The library I use is simple-xml-2.1.8.jar.

As someone pointed out to me, ISO8859-1 is not a valid encoding name; ISO-8859-1 is the correct one. As I mentioned, it is difficult to ask the "producers" to correct their files, so I would like to handle the problem in my application.

Upvotes: 4

Views: 4376

Answers (2)

Ian Roberts

Reputation: 122364

If you know the file encoding up front (UTF-8, ISO-8859-1 or whatever) then you should create a suitable Reader configured for that encoding, then use the Persister.read method that takes a Reader instead of the one that takes a File or InputStream. That way you are in control of the byte-to-character decoding rather than relying on the XML reader to detect the encoding (and fail, as the file declared it wrongly). So instead of

File f = new File(....);
MyType obj = persister.read(MyType.class, f);

you would do something more like

File f = new File(....);
MyType obj = null;
try( FileInputStream fis = new FileInputStream(f);
     InputStreamReader reader = new InputStreamReader(fis, "ISO-8859-1")) { // or UTF-8, ...
  obj = persister.read(MyType.class, reader);
}
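The same idea can be tried without Simple XML, using the JDK's StAX parser directly (the one that appears in the stack trace above). This is only a sketch: the class name `ReaderDecodingDemo` and the helper `rootText` are made up for illustration, and it assumes the bytes really are ISO-8859-1. Because the parser receives a Reader, it should not need to map the declared encoding name to a decoder.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ReaderDecodingDemo {

    // Hypothetical helper: decodes the bytes with the charset chosen by the
    // caller and returns the text content of the root element.
    static String rootText(byte[] xmlBytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(xmlBytes), charset)) {
            // The parser is handed characters, not bytes, so the decoding
            // has already been done by the InputStreamReader.
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(reader);
            try {
                xml.nextTag();               // advance to the root start element
                return xml.getElementText(); // text content of the root element
            } finally {
                xml.close();
            }
        } catch (IOException | XMLStreamException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // Same (misspelt) declaration as in the question, with one non-ASCII character.
        String doc = "<?xml version=\"1.0\" encoding=\"ISO8859-1\" standalone=\"no\" ?>"
                   + "<name>Caf\u00e9</name>";
        byte[] bytes = doc.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

With Simple XML, the equivalent step is to hand the same kind of Reader to the Persister.read overload that takes a Reader, as in the snippet above.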

Upvotes: 1

Bruno Grieder

Reputation: 29824

Get access to the Xerces XMLReader instance from Simple XML and set

reader.setFeature("http://apache.org/xml/features/allow-java-encodings", true);

before parsing the XML.

Since ISO8859-1 "works" in Java, this may just work.

The list of "features" supported by Xerces is available in the Xerces documentation.

Alternatively, a good old regex on encoding="ISO8859-1" to fix the XML should do the trick, prior to processing it.
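A minimal sketch of the regex approach, assuming the whole document fits in memory and that the only thing wrong is the misspelt name in the XML declaration. The class name `EncodingDeclarationFixer` and the method `fixDeclaration` are invented for this example. Decoding and re-encoding with ISO-8859-1 maps every byte to one character and back, so the rest of the document is left byte-for-byte unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class EncodingDeclarationFixer {

    // Matches an XML declaration at the very start of the document whose
    // encoding pseudo-attribute is spelt ISO8859-1. Group 1 is everything up
    // to and including the opening quote, group 2 is the closing quote.
    private static final Pattern BAD_DECLARATION =
            Pattern.compile("^(<\\?xml[^>]*?\\bencoding\\s*=\\s*[\"'])ISO8859-1([\"'])");

    // Hypothetical helper: returns a copy of the document in which the
    // declaration says ISO-8859-1. Documents without the misspelt
    // declaration are returned with identical content.
    static byte[] fixDeclaration(byte[] xmlBytes) {
        String text = new String(xmlBytes, StandardCharsets.ISO_8859_1);
        String fixed = BAD_DECLARATION.matcher(text).replaceFirst("$1ISO-8859-1$2");
        return fixed.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] original =
                "<?xml version=\"1.0\" encoding=\"ISO8859-1\" standalone=\"no\" ?><a>\u00e9</a>"
                        .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(new String(fixDeclaration(original), StandardCharsets.ISO_8859_1));
    }
}
```

The returned bytes can then be wrapped in a ByteArrayInputStream and passed to whatever currently reads the file.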

Upvotes: 2
