Reputation: 1048
I have a Java application that exports data from a database, manipulates some fields, and reloads it into another database for testing.
Some fields in the tables the application uses were encrypted recently, and since then the application throws an exception when it tries to manipulate the data it exported as an XML file. Below is the stack trace:
java.lang.Exception: Error Parsing String
at com.oocl.frm.xmlutil.xmlbeans.XmlBeansUtil.unmarshall(XmlBeansUtil.java:37)
at com.oocl.automation.object.DataSet.<init>(DataSet.java:12)
at com.oocl.automation.process.BaseProcess.process(BaseProcess.java:21)
at com.oocl.automation.TestAutomation.main(TestAutomation.java:30)
Caused by: org.apache.xmlbeans.XmlException: error: Character reference to illegal XML character
org.apache.xmlbeans.impl.piccolo.io.IllegalCharException: Character reference to illegal XML character
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseEncodedChar(PiccoloLexer.java:3131)
at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:4899)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3435)
at org.apache.xmlbeans.impl.store.Locale.parse(Locale.java:706)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:690)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:677)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:208)
at org.apache.xmlbeans.XmlObject$Factory.parse(XmlObject.java:579)
at com.oocl.frm.xmlutil.xmlbeans.XmlBeansUtil.unmarshall(XmlBeansUtil.java:35)
at com.oocl.automation.object.DataSet.<init>(DataSet.java:12)
at com.oocl.automation.process.BaseProcess.process(BaseProcess.java:21)
at com.oocl.automation.TestAutomation.main(TestAutomation.java:30)
at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3467)
at org.apache.xmlbeans.impl.store.Locale.parse(Locale.java:706)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:690)
at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:677)
at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:208)
at org.apache.xmlbeans.XmlObject$Factory.parse(XmlObject.java:579)
at com.oocl.frm.xmlutil.xmlbeans.XmlBeansUtil.unmarshall(XmlBeansUtil.java:35)
... 3 more
Caused by: org.xml.sax.SAXParseException; systemId: file:; lineNumber: 39313; columnNumber: 657; Character reference to illegal XML character
Is there some way I can make the XML parser handle or exclude these illegal characters? I couldn't find a proper answer anywhere.
Any help is greatly appreciated.
My xml version is I tried with 1.1 but it is not working as well. Also the character that is throwing the exception is ; .
Upvotes: 2
Views: 2914
Reputation: 8227
First step is to determine what the illegal characters are and why they were introduced. If the database software is exporting illegal XML characters, you may need to get the vendor to fix it, or you may need to work around the problem by exporting a sanitized version of the field.
If you can't fix the problem at the source, then preprocess the source XML, either by copying and filtering into a separate file, or by creating a filtering stream reader that you can put in front of your XML reader.
Be aware, though, that simply discarding illegal characters can have downstream effects, as can encoding the characters (e.g., by using URL-encoding or some such).
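As a rough illustration of the preprocessing idea, here is a minimal sketch that removes numeric character references whose code point is not allowed in XML 1.0. It assumes the offending input is a reference such as `&#x1A;` (which is what the "Character reference to illegal XML character" message suggests) and that the exported file fits in memory as a String. The class and method names are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of XMLBeans or the original application.
final class XmlCharRefSanitizer {

    // Matches numeric character references such as &#26; or &#x1A;
    private static final Pattern CHAR_REF =
            Pattern.compile("&#(?:x([0-9a-fA-F]+)|([0-9]+));");

    // True if the code point is allowed by the XML 1.0 Char production.
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Drops every numeric character reference that XML 1.0 forbids
    // and leaves all other text, including legal references, untouched.
    static String stripIllegalCharRefs(String xml) {
        Matcher m = CHAR_REF.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp;
            try {
                cp = m.group(1) != null
                        ? Integer.parseInt(m.group(1), 16)
                        : Integer.parseInt(m.group(2));
            } catch (NumberFormatException e) {
                cp = -1; // too large to be a code point, treat as illegal
            }
            String replacement = isLegalXml10(cp) ? m.group() : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling `XmlCharRefSanitizer.stripIllegalCharRefs(xml)` before handing the text to the parser would remove the references. Note that this sketch does not distinguish CDATA sections, and that dropping bytes from an encrypted field will almost certainly make it impossible to decrypt, which is the downstream effect mentioned above.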
Upvotes: 1
Reputation: 23637
Your problem seems to be here (line 39313, column 657):
Caused by: org.xml.sax.SAXParseException; systemId: file:; lineNumber: 39313; columnNumber: 657; Character reference to illegal XML character
If your XML contains special characters and you have parsed it successfully before, the culprit may be a character that is illegal in XML 1.0 but legal in XML 1.1. Check the version in the XML declaration of your file, or configure your parser to treat it as XML 1.1.
UPDATE: I see your implementation uses the Piccolo parser. There is a bug reported in 2007 (but it might have already been fixed by now). Anyway, it's worth checking which version you are using and the current status of that bug. You might need to use a different parser or ignore the offending chars when parsing.
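To make the 1.0 versus 1.1 difference concrete, the sketch below encodes the `Char` productions of the two specifications. In XML 1.1 the control characters U+0001 to U+001F become legal, but apart from tab, line feed, and carriage return they may only appear as character references; U+0000 is illegal in both versions. The class name is invented for this example.

```java
// Hypothetical helper that mirrors the Char productions of XML 1.0 and 1.1.
final class XmlCharLegality {

    // XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // XML 1.1: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```

For example, `isXml10Char(0x1A)` is false while `isXml11Char(0x1A)` is true, so a reference like `&#x1A;` is only acceptable to a parser that actually implements XML 1.1 and sees a 1.1 declaration.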
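One way to tell a Piccolo-specific problem from a genuinely illegal document is to run the same text through the JAXP parser that ships with the JDK. The following is only a diagnostic sketch; the class and method names are invented, and it parses from a String rather than from the exported file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical diagnostic helper using only the JDK's built-in parser.
final class JdkParserCheck {

    // Returns null if the text parses, otherwise a description of the first fatal error.
    static String firstFatalError(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ", column "
                    + e.getColumnNumber() + ": " + e.getMessage();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException("Could not run the parser", e);
        }
    }
}
```

If this parser rejects the document at the same position, the data itself is the problem rather than Piccolo. The same check can be repeated on a copy of the file with a 1.1 declaration to see whether the version makes a difference.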
Upvotes: 1