S K padala

Reputation: 251

Not able to write XML with ISO encoding to MarkLogic via the Java API

We are trying to insert an XML document with ISO encoding into MarkLogic through the Java API, but we get the error below. The XML contains special characters, for example the registered trademark sign: <h4> ® </h4>

Bad Request. Server Message: XDMP-DOCUTF8SEQ: Invalid UTF-8 escape sequence at  line 14145 -- document is not UTF-8 encoded. 

Code:

DatabaseClient client = DatabaseClientFactory.newClient(IP, PORT,
        DATABASE_NAME, USERNAME, PWD, Authentication.DIGEST);

// acquire the content
InputStream xmlDocStream = XMLController.class.getClassLoader()
        .getResourceAsStream("path to xml file");

// create a manager for XML documents
XMLDocumentManager xmlDocMgr = client.newXMLDocumentManager();

// create a handle on the content
InputStreamHandle xmlhandle = new InputStreamHandle(xmlDocStream);

// write the document content
xmlDocMgr.write("/" + filename, xmlhandle);

Upvotes: 2

Views: 340

Answers (1)

ehennum

Reputation: 7335

Sravan:

The solution is to specify the document's actual ISO encoding when you read the resource, by wrapping the input stream in an InputStreamReader constructed with that charset name:

http://docs.oracle.com/javase/8/docs/api/java/io/InputStreamReader.html#InputStreamReader-java.io.InputStream-java.lang.String-

The Java API converts to UTF-8 when it knows that the content has a different encoding but otherwise assumes that the content is already UTF-8. For more detail about conversion of encoding, see:

http://docs.marklogic.com/guide/java/document-operations#id_11208
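As a rough sketch of the idea, assuming the file is ISO-8859-1 (the question only says "ISO encoding", so the exact charset is a guess), the stream can be decoded explicitly before it is handed to the document manager. The class and method names here (IsoReaderSketch, openIsoLatin1, readAll) are made up for illustration, and the MarkLogic call appears only in a comment because it needs the client library and a running server; ReaderHandle is the handle I would expect to pair with a Reader, but check it against your API version.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class IsoReaderSketch {

    // Wrap a byte stream in a Reader that decodes it as ISO-8859-1.
    // The charset is an assumption; use whatever the XML file is really encoded in.
    static Reader openIsoLatin1(InputStream in) {
        return new InputStreamReader(in, StandardCharsets.ISO_8859_1);
    }

    // Drain a Reader into a String, standing in for whatever consumes the Reader.
    static String readAll(Reader reader) {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In ISO-8859-1 the registered trademark sign is the single byte 0xAE,
        // which is not a valid UTF-8 sequence on its own.
        byte[] isoBytes = "<h4>\u00AE</h4>".getBytes(StandardCharsets.ISO_8859_1);

        // Decode with the declared charset, then re-encode as UTF-8.
        String xml = readAll(openIsoLatin1(new ByteArrayInputStream(isoBytes)));
        byte[] utf8Bytes = xml.getBytes(StandardCharsets.UTF_8);

        // The sign becomes the two bytes 0xC2 0xAE in UTF-8.
        System.out.println(isoBytes.length + " ISO bytes, " + utf8Bytes.length + " UTF-8 bytes");

        // With the MarkLogic Java API (untested assumption, not run here):
        //   xmlDocMgr.write("/" + filename, new ReaderHandle(openIsoLatin1(xmlDocStream)));
    }
}
```

The point of going through a Reader is that the bytes are turned into characters with a known charset first, so whatever writes them out afterwards can produce valid UTF-8 instead of passing the raw 0xAE byte through.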

Hoping that helps.

Upvotes: 2
