Java StAX - error when parsing - Illegal character entity: expansion character code 0x19

Question

I am reading/parsing an XML file with javax.xml.stream.XMLStreamReader.
The file contains this piece of XML data as shown below.

Unfortunately I am getting this error and I am not sure how to resolve it.

    Error in downloadXML: 
    com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x19
     at [row,col {unknown-source}]: [674,40]
        at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:606)
        at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:479)
        at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2448)
        at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2395)
        at com.ctc.wstx.sr.StreamScanner.resolveSimpleEntity(StreamScanner.java:1218)
        at com.ctc.wstx.sr.BasicStreamReader.parseAttrValue(BasicStreamReader.java:1929)
        at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3063)
        at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:2961)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2837)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1072)

The problem seems to be with the character entity that expands to code 0x19.
Of course I can first read the file simply as a text file, and replace this bad character, and only then parse it with XMLStreamReader but:
1) that approach seems really clumsy to me;
2) it will be a bit difficult to do as the code is quite involved there,
so I am not sure if I want to change it just for this character.

Why is the XMLStreamReader unable to handle this character?
Is the XML invalid, or does the parser have a bug?

Indent · Accepted Answer

The characters &, < and > (as well as " or ' inside attribute values) cannot appear unescaped in XML.

They are escaped using XML entities; in this case you want &amp; for &.
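
If you control the code that produces the XML, the escaping can be sketched roughly like this. The class and method names (XmlEscaper, escape) are made up for illustration and are not part of StAX or Woodstox; an XMLStreamWriter does this for you, so treat it only as a picture of what the escaping amounts to.

```java
final class XmlEscaper {
    private XmlEscaper() {}

    /** Replaces the five characters that have predefined XML entities. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Tom &amp; Jerry &lt;&quot;cartoon&quot;&gt;
        System.out.println(escape("Tom & Jerry <\"cartoon\">"));
    }
}
```

Note that this only covers the five markup characters. It does not make a control character such as 0x19 legal.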

Your XML is invalid, and every conforming parser will reject it. You may need to fix the producer of this XML content.
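
To check that this is not specific to Woodstox, you could run a small test like the one below against whatever StAX implementation is on your classpath (the JDK ships one). This is only a sketch: IllegalCharRefDemo and parses are hypothetical names, and the sample documents are made up, since the original XML is not shown.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class IllegalCharRefDemo {
    private IllegalCharRefDemo() {}

    /** Returns true if the document can be read to the end without a parse error. */
    static boolean parses(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            // Well-formedness errors, including illegal character references, end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // A reference to 0x19 inside an attribute value, as in the stack trace above.
        System.out.println(parses("<a b=\"x&#x19;y\"/>"));
        // A reference to 0x9 (tab), which XML 1.0 allows.
        System.out.println(parses("<a b=\"x&#x9;y\"/>"));
    }
}
```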

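**Edit** from https://www.w3.org/TR/xml/#NT-Char

Allowed range for a character reference:

Reference ::= EntityRef | CharRef
EntityRef ::= '&' Name ';'
CharRef   ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
Char      ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

A character reference must expand to a character that matches Char. Code 0x19 is below #x20 and is not #x9, #xA or #xD, so a reference to it is illegal in XML 1.0 and the parser is right to reject it.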
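
If you cannot fix the producer, the pre-processing step mentioned in the question could look something like the sketch below. It checks a code point against the allowed ranges quoted above and drops numeric character references that fall outside them. XmlCharRefSanitizer and its methods are hypothetical names. The sketch works on a whole String, silently changes the data, and would also touch text inside CDATA sections or comments, where such sequences are not references at all, so use it with care.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class XmlCharRefSanitizer {
    // Matches hexadecimal (&#x19;) and decimal (&#25;) character references.
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(?:x([0-9a-fA-F]+)|([0-9]+));");

    private XmlCharRefSanitizer() {}

    /** True if the code point is in the XML 1.0 allowed character ranges. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Removes numeric character references whose value is not a legal XML 1.0 character. */
    static String stripIllegalCharRefs(String xml) {
        Matcher m = CHAR_REF.matcher(xml);
        StringBuilder out = new StringBuilder(xml.length());
        int last = 0;
        while (m.find()) {
            out.append(xml, last, m.start());
            if (isLegalRef(m)) {
                out.append(m.group()); // keep legal references untouched
            }
            last = m.end();
        }
        out.append(xml, last, xml.length());
        return out.toString();
    }

    private static boolean isLegalRef(Matcher m) {
        try {
            int cp = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            return isXmlChar(cp);
        } catch (NumberFormatException e) {
            return false; // value too large to be any Unicode code point
        }
    }

    public static void main(String[] args) {
        // prints: <a b="xy&#65;"/>
        System.out.println(stripIllegalCharRefs("<a b=\"x&#x19;y&#65;\"/>"));
    }
}
```

The cleaned String can then be handed to XMLStreamReader through a StringReader. For large files you would want a streaming variant instead of loading everything into memory.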
