Sharath Madappa

Reputation: 3402

Charset issue with XSS API in CQ5, à being displayed as �

I'm using com.adobe.granite.xss to encode strings in a JSP. It works with most characters, but à is displayed as Ã�.

It happens when using the xssAPI.encodeForHTML() method. I have tried <cq:text> with escapeXml="true" and it behaves the same way.

The characters are stored properly in the repository, and I have also set content="text/html; charset=utf-8" in the JSP.
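For reference, output of this kind usually appears when the UTF-8 bytes of à are decoded as ISO-8859-1 somewhere along the way. The sketch below reproduces that in plain Java, without the XSS API or CQ5; the class name MojibakeDemo and the method misdecode are hypothetical, and it only illustrates the suspected mechanism, not where it happens in the page.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode the string as UTF-8, then (wrongly) decode those bytes as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // U+00E0 is à; its UTF-8 encoding is the two bytes C3 A0.
        String garbled = misdecode("\u00E0");
        // Print each resulting char as a code point: U+00C3 (Ã), then U+00A0.
        for (int i = 0; i < garbled.length(); i++) {
            System.out.printf("U+%04X%n", (int) garbled.charAt(i));
        }
    }
}
```

If the page shows Ã followed by a blank or a replacement glyph, that points at a mismatch between the encoding the response is written in and the one the browser (or an intermediate layer) assumes.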

Is there a way to encode or filter the input for XSS without the charset breaking in such situations?

I have tried it with different non-Latin characters, and most of them are not affected by the XSS API.


Upvotes: 0

Views: 861

Answers (1)

Oleksandr Tarasenko

Reputation: 1454

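It looks like an issue in owasp-esapi-java, which CQ's XSSAPI uses, because it iterates through the string with charAt(). charAt() returns UTF-16 code units rather than code points, so characters outside the BMP are split into surrogate halves. (à itself is U+00E0, which is inside the BMP, so this may not fully explain the symptom.) The code-point-safe way to iterate is: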

final int length = s.length();
for (int offset = 0; offset < length; ) {
   final int codepoint = s.codePointAt(offset);

   // do something with the codepoint

   offset += Character.charCount(codepoint);
}

(from How can I iterate through the unicode codepoints of a Java String?)
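To check whether a particular character is actually affected by charAt()-style iteration, you can compare its length in chars with its number of code points. This is a minimal sketch built on the loop above; the class name CodePointDemo and the helper codePoints are hypothetical, and it does not call ESAPI or the XSS API.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointDemo {
    // Collect the code points of a string using the offset-based loop quoted above.
    static List<Integer> codePoints(String s) {
        List<Integer> out = new ArrayList<>();
        final int length = s.length();
        for (int offset = 0; offset < length; ) {
            final int codepoint = s.codePointAt(offset);
            out.add(codepoint);
            offset += Character.charCount(codepoint);
        }
        return out;
    }

    public static void main(String[] args) {
        String bmp = "\u00E0";           // à: a single UTF-16 code unit
        String astral = "\uD83D\uDE00";  // U+1F600: a surrogate pair
        // Prints "1 1": one char, one code point, so charAt() sees it whole.
        System.out.println(bmp.length() + " " + codePoints(bmp).size());
        // Prints "2 1": two chars, one code point, so charAt() would split it.
        System.out.println(astral.length() + " " + codePoints(astral).size());
    }
}
```

A character that reports the same count for both is handled identically by charAt() and codePointAt(), so any corruption of it would have to come from somewhere else, such as the response encoding.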

So I think that it's an issue of this library.

Try xssAPI.filterHTML() instead; it may solve your issue.

Upvotes: 2
