Nuri Ensing

Reputation: 2030

Characters altered by Lotus when receiving a POST through a Java WebAgent with OpenURL command

I have a Java WebAgent in Lotus Domino which runs through the OpenURL command (https://link.com/db.nsf/agentName?openagent). This agent was created to receive a POST with XML content. Before the agent even parses or saves the XML content, the content is saved into an in-memory document:

For an agent run from a browser with the OpenAgent URL command, the in-memory document is a new document containing an item for each CGI (Common Gateway Interface) variable supported by Domino®. Each item has the name and current value of a supported CGI variable. (No design work on your part is needed; the CGI variables are available automatically.) https://www.ibm.com/support/knowledgecenter/en/SSVRGU_9.0.1/basic/H_DOCUMENTCONTEXT_PROPERTY_JAVA.html

The content of the POST is saved (by Lotus) into the request_content field. When the content contains the character é, for example:

 <Name xml:lang="en">tést</Name>

The é is changed by Lotus to ?®. This is also what I see when I inspect the request_content field in the document properties. Is it possible to save the é as é rather than ?® in Lotus?

Solution:

The way I fixed it is via this post:

Link which helped me solve this problem

The solution but in Java:

    /****** INITIALIZATION ******/
    session = getSession();
    AgentContext agentContext = session.getAgentContext();

    // Write the item text to a temporary file as LMBCS to get the raw bytes back on disk
    Stream stream = session.createStream();
    stream.open("C:\\Temp\\test.txt", "LMBCS");
    stream.writeText(agentContext.getDocumentContext().getItemValueString("REQUEST_CONTENT"));
    stream.close();

    // Read the same bytes back, this time interpreting them as UTF-8
    stream.open("C:\\Temp\\test.txt", "UTF-8");
    String Content = stream.readText();
    stream.close();
    System.out.println("Content: " + Content);

Upvotes: 3

Views: 1474

Answers (2)

Normunds Kalnberzins

Reputation: 1245

My heart breaks looking at this. I just went through this hell myself and found the same old advice, but I could not bring myself to write to disk to solve such a trivial matter.

Item item = agentContext.getDocumentContext().getFirstItem("REQUEST_CONTENT");
byte[] bytes = item.getValueCustomDataBytes("");
String content = new String(bytes, Charset.forName("UTF-8"));

Edited in response to comment by OP: There is an old post on this theme: http://www-10.lotus.com/ldd/nd85forum.nsf/DateAllFlatWeb/ab8a5283e5a4acd485257baa006bbef2?OpenDocument (the same thread that OP used for his workaround)

The poster there claims that the method fails when he uses a particular HTTP header. Note that he was working with 8.5 and using LotusScript. In my case I cannot make it fail by sending an additional header (or by varying the string argument).

How I Learned to Stop Worrying and Love Notes/Domino: for what it's worth, getValueCustomDataBytes() works only with very short payloads, and the limit depends on the content! Starting your text with an accented character such as 'é' increases the length it still works with, but whatever I tried I could not get past 195 characters. Am I surprised? After all these years with Notes, I must admit I still am...

Well, admittedly it should not have worked in the first place as it is documented to be used only with User Defined Data fields.

Finally: use IBM's icu4j and icu4j-charset packages and drop them in jvm/lib/ext. The code then becomes:

// CharsetICU comes from the icu4j-charset package (com.ibm.icu.charset)
byte[] bytes = item.getText().getBytes(CharsetICU.forNameICU("LMBCS"));
String content = new String(bytes, Charset.forName("UTF-8"));

And yes, this needs a permission in java.policy:

permission java.lang.RuntimePermission "charsetProvider"; 

Is this any better than passing through the file system? Don't know. But kinda looks cleaner.
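The re-encoding trick used in this answer can be tried outside Domino. The following is only a sketch: a standard JVM has no LMBCS charset, so ISO-8859-1 stands in for the "wrong" charset here (an assumption made purely for illustration), and the class and method names are hypothetical. The idea is the same: turn the garbled String back into the bytes it came from, then decode those bytes with the charset the sender actually used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Recover text that was decoded with the wrong charset:
    // re-encode with the wrong charset to get the original bytes back,
    // then decode those bytes with the charset the sender really used.
    static String redecode(String garbled, Charset wrongCharset, Charset actualCharset) {
        byte[] raw = garbled.getBytes(wrongCharset);
        return new String(raw, actualCharset);
    }

    public static void main(String[] args) {
        // The sender posts UTF-8 bytes (\u00e9 is the character e-acute).
        byte[] posted = "<Name xml:lang=\"en\">t\u00e9st</Name>"
                .getBytes(StandardCharsets.UTF_8);

        // The receiver decodes them with a single-byte charset
        // (stand-in for LMBCS), so the two UTF-8 bytes become two characters.
        String garbled = new String(posted, StandardCharsets.ISO_8859_1);

        String repaired = redecode(garbled,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println("garbled length:  " + garbled.length());
        System.out.println("repaired length: " + repaired.length());
        System.out.println("repaired ok:     " + repaired.contains("t\u00e9st"));
    }
}
```

This only works when the wrong charset can map every byte back losslessly, which holds for ISO-8859-1 but is not guaranteed for every charset.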

Upvotes: 2

Richard Schwartz

Reputation: 14628

I've dealt with this before, but I no longer have access to the code so I'm going to have to work from memory.

This looks like a UTF-8 vs UTF-16 issue, but there are up to five charsets that can come into play: the charset used in the code that does the POST, the charset of the JVM the agent runs in, the charset of the Domino server code, the charset of the NSF - which is always LMBCS, and the charset of the Domino server's host OS.

If I recall correctly, REQUEST_CONTENT is treated as raw data, not character data. To get it right, you have to handle the conversion of REQUEST_CONTENT yourself.

The Notes API calls that you use to save data in the Java agent will automatically convert from Unicode to LMBCS and vice versa, but this only works if Java has interpreted the incoming data stream correctly. I think in most cases, the JVM running under Domino is configured for UTF-16 - though that may not be the case. (I recall some issue with a server in Japan, and one of the charsets that came into play was one of the JIS standard charsets, but I don't recall if that was in the JVM.)

So if I recall correctly, you need to read REQUEST_CONTENT as UTF-8 from a String into a byte array by using getBytes("UTF-8") and then construct a new String from the byte array using new String(byte[] bytes, "UTF-16"). That's assuming that the JVM is in fact using UTF-16. Then pass that string to NotesDocument.ReplaceItemValue() so that the Notes API calls interpret it correctly.

I may have some details wrong here. It's been a while. A long time ago I built a database that shows the LMBCS, UTF-8 and UTF-16 values for all Unicode characters. If you can get down to the byte values, it can be a useful tool for looking at data like this and figuring out what's really going on. It's downloadable from OpenNTF here. In a situation like this, I recall writing code that got the byte array, converted it to hex and wrote it to a NotesItem so that I could see exactly what was coming in and compare it to the database entries.
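The hex-dump approach described above might look roughly like this. It is a minimal sketch, not the original code (which is no longer available): the class and method names are hypothetical, and it prints to the console instead of writing to a NotesItem so that it runs on a plain JVM.

```java
import java.nio.charset.StandardCharsets;

public class HexDump {

    // Render a byte array as space-separated two-digit hex values.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 3);
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes are not sign-extended.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "t\u00e9st"; // \u00e9 is the character e-acute

        // hex part: 74 C3 A9 73 74
        System.out.println("UTF-8:    " + toHex(text.getBytes(StandardCharsets.UTF_8)));
        // hex part: 00 74 00 E9 00 73 00 74
        System.out.println("UTF-16BE: " + toHex(text.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

In an agent, the resulting hex string could be stored in an item on a document and compared byte by byte with the expected encodings.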

And, yes, as per the comments, it's much better if you let the XML tools on both sides handle the charset issues and encoding - but it's not always foolproof. You're adding another layer of charsets into the process! You have to get it right. If the goal is to store data in NotesItems, you still have to make sure that the server-side XML tools decode into the correct charset, which may not be the default.
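As an illustration of letting the XML tools handle the charset, the sketch below hands raw bytes (not a String) to the standard JAXP parser, which picks the encoding from the XML declaration. It assumes you can get at the raw bytes of the POST body at all, which is exactly the hard part with REQUEST_CONTENT; the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlBytesParse {

    // Parse XML from raw bytes and return the text of the root element.
    // The parser reads the encoding from the XML declaration, so no
    // manual byte-to-String conversion happens before parsing.
    static String rootText(byte[] xmlBytes) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Same logical document, encoded two different ways by the sender.
        byte[] utf8 = ("<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<Name xml:lang=\"en\">t\u00e9st</Name>")
                .getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<Name xml:lang=\"en\">t\u00e9st</Name>")
                .getBytes(StandardCharsets.ISO_8859_1);

        // Both parse to the same Java String despite different bytes.
        System.out.println(rootText(utf8).equals(rootText(latin1)));
    }
}
```

The decoded value is an ordinary Java String at that point; whether it then lands correctly in a NotesItem still depends on the Notes API conversion to LMBCS.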

Upvotes: 2
