Reputation: 3402
I'm using com.adobe.granite.xss for encoding strings in JSP. It seems to work with most characters, except for Ã. à is displayed as Ã�. It happens when using the xssAPI.encodeForHTML() method. I have tried <cq:text> with escapeXml="true" and it has the same behaviour.
The characters are stored properly in the repository, and I have also set content="text/html; charset=utf-8" in the JSP.
Is there a way to encode or filter the input for XSS without the charset breaking in such situations?
I have tried it with different non-Latin characters, and most of them are not affected by the XSS API.
Upvotes: 0
Views: 861
Reputation: 1454
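It looks like an issue in owasp-esapi-java, which CQ's XSSAPI uses, because it iterates through the string with charAt(), which returns UTF-16 code units rather than code points. That only affects characters outside the BMP, though; accented Latin letters like the one in your question are inside it, so this may not fully explain your case. For reference, the code-point-safe way to iterate is: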
final int length = s.length();
for (int offset = 0; offset < length; ) {
    final int codepoint = s.codePointAt(offset);
    // do something with the codepoint
    offset += Character.charCount(codepoint);
}
(from How can I iterate through the Unicode codepoints of a Java String?)
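To see when the two iteration styles actually differ, here is a minimal, JDK-only sketch. The class name CodePointDemo and the helper countCodePoints are made up for illustration and are not part of ESAPI or XSSAPI. It compares String.length() (UTF-16 code units, which is what charAt() walks over) with a code point count, for one BMP character and one supplementary character.

```java
class CodePointDemo {
    // Hypothetical helper: counts code points using the codePointAt() loop.
    static int countCodePoints(String s) {
        int count = 0;
        final int length = s.length();
        for (int offset = 0; offset < length; ) {
            final int codepoint = s.codePointAt(offset);
            count++;
            offset += Character.charCount(codepoint);
        }
        return count;
    }

    public static void main(String[] args) {
        String bmp = "\u00E0";                 // a-grave, inside the BMP
        String supplementary = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair

        // One code unit, one code point: charAt() sees the whole character.
        System.out.println(bmp.length() + " " + countCodePoints(bmp));

        // Two code units, one code point: charAt() sees two surrogate halves.
        System.out.println(supplementary.length() + " " + countCodePoints(supplementary));
        System.out.println(Character.isSurrogate(supplementary.charAt(0)));
    }
}
```

For the BMP string both counts are 1, so a charAt() loop and a codePointAt() loop behave the same; only the supplementary string is split in two by charAt().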
So I think the issue is in that library.
Try xssAPI.filterHTML() instead; it may solve your issue.
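It may also be worth ruling out a plain charset mismatch, since an Ã followed by a stray character is what UTF-8 bytes typically look like when they are decoded as ISO-8859-1. The sketch below is only an illustration of that effect using the JDK. The names MojibakeDemo and decodeUtf8AsLatin1 are hypothetical, and it does not call XSSAPI or claim that this is what happens in your JSP.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Hypothetical helper: encodes as UTF-8, then (wrongly) decodes as ISO-8859-1.
    static String decodeUtf8AsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String garbled = decodeUtf8AsLatin1("\u00E0"); // a-grave

        // Print each resulting char as a code point to make invisible ones visible.
        for (int i = 0; i < garbled.length(); i++) {
            System.out.printf("U+%04X%n", (int) garbled.charAt(i));
        }
    }
}
```

This prints U+00C3 (Ã) and U+00A0 (a no-break space). If the page shows the same pair, checking the response encoding and the JSP pageEncoding would be a reasonable next step before blaming the encoder.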
Upvotes: 2