Reputation: 1559
I have a large XML document that contains characters such as ZÖE, DÉCOR and CIARÁN. I am using Java with MarkLogic as my database. I am unable to read the document while it contains these words; when I remove them, the read works perfectly.
My Java Code:
DatabaseClient client = DatabaseClientFactory.newClient(IP, PORT,
DATABASE_NAME, USERNAME, PWD, Authentication.DIGEST);
XMLDocumentManager docMgr = client.newXMLDocumentManager();
DOMHandle xmlhandle = new DOMHandle();
docMgr.read("/" + filename, xmlhandle);
Changed question: As I said, I was unable to read special characters. Now, how can I insert special characters so that I get them back in the same form when I read the document?
Example: When I insert characters like CIARÁN AURÉLIE BARGÈME, the document is saved, but when I read it back the data looks like CIARAN AURELIE BARGEME, not as inserted.
DatabaseClient client = DatabaseClientFactory.newClient(IP, PORT,
DATABASE_NAME, USERNAME, PWD, Authentication.DIGEST);
XMLDocumentManager docMgr = client.newXMLDocumentManager();
DOMHandle xmlhandle = new DOMHandle();
docMgr.read("/" + filename, xmlhandle);
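String doc = xmlhandle.toString();
// NFD splits each accented letter into a base letter plus a combining mark;
// the regex then deletes every non-ASCII character, including those marks.
String data = Normalizer.normalize(doc, Normalizer.Form.NFD)
.replaceAll("[^\\p{ASCII}]", "");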
I am using Normalizer to read the special characters; otherwise, the plain xmlhandle works fine.
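For reference, the normalization step can be tried on its own, without MarkLogic. This is a minimal sketch using only java.text.Normalizer; the class name StripAccentsDemo, the helper names and the sample string are illustrative and not part of the code above. It suggests that NFD followed by removal of non-ASCII characters turns CIARÁN AURÉLIE BARGÈME into CIARAN AURELIE BARGEME, which matches the output described in the example, while NFC normalization leaves the accents in place.

```java
import java.text.Normalizer;

class StripAccentsDemo {

    // Same two steps as in the question: decompose to NFD, then drop
    // everything outside ASCII (this also drops the combining accent marks).
    static String stripAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("[^\\p{ASCII}]", "");
    }

    // Alternative for comparison: NFC composes base letter + mark back into
    // a single accented character and removes nothing.
    static String keepAccents(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // "CIARÁN AURÉLIE BARGÈME", written with Unicode escapes so the
        // example does not depend on the source file encoding.
        String input = "CIAR\u00C1N AUR\u00C9LIE BARG\u00C8ME";
        System.out.println(stripAccents(input));
        System.out.println(keepAccents(input));
    }
}
```

Whether the accented output prints correctly also depends on the console encoding, so comparing the strings in code is more reliable than reading the terminal.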
Upvotes: 3
Views: 230
Reputation: 1794
According to the official MarkLogic documentation:
If you specify the encoding and it turns out to be the wrong encoding, then the conversion will likely not turn out as you expect.
MarkLogic Server stores text, XML, and JSON as UTF-8. In Java, characters in memory and reading streams are UTF-16. The Java API converts characters to and from UTF-8 automatically.
When writing documents to the server, you need to know if they are already UTF-8 encoded. If a document is not UTF-8, you must specify its encoding or you are likely to end up with data that has incorrect characters due to the incorrect encoding. If you specify a non-UTF-8 encoding, the Java API will automatically convert the encoding to UTF-8 when writing to MarkLogic.
https://docs.marklogic.com/guide/java/document-operations#id_11208
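To illustrate the point about wrong encodings, here is a small sketch in plain Java (no MarkLogic calls). It encodes a string as UTF-8 bytes and decodes those bytes once with the correct charset and once with ISO-8859-1. The class name EncodingDemo and the helper decodeUtf8BytesAs are hypothetical names chosen for this example; it only shows the general effect the documentation warns about, not how the Java API performs its conversion internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {

    // Encode the text as UTF-8, then decode the bytes with the given charset.
    // Decoding with anything other than UTF-8 simulates declaring the wrong
    // encoding for a document.
    static String decodeUtf8BytesAs(String text, Charset declared) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, declared);
    }

    public static void main(String[] args) {
        // "CIARÁN", written with a Unicode escape so the example does not
        // depend on the source file encoding.
        String original = "CIAR\u00C1N";

        String correct = decodeUtf8BytesAs(original, StandardCharsets.UTF_8);
        String wrong = decodeUtf8BytesAs(original, StandardCharsets.ISO_8859_1);

        // The correct charset round-trips the text unchanged.
        System.out.println(correct.equals(original));
        // The wrong charset turns the two UTF-8 bytes of the accented letter
        // into two separate characters, so the text no longer matches.
        System.out.println(wrong.equals(original));
        System.out.println(wrong.length());
    }
}
```

So if the source file is not UTF-8, its actual encoding has to be declared when writing; if it is already UTF-8 and the accents still disappear, the cause is more likely a later processing step than the storage itself.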
Upvotes: 3