trinity
trinity

Reputation: 21

GWT SimpleHtmlSanitizer

Here is what I am trying to do. We are using Klockwork static analysis tools to determine weaknesses in our GWT application.

We have an ExportServlet that exports information. Before writing it out we construct the HttpResponse like so,

final String designName = req.getParameter("designName");
resp.setHeader("Content-Disposition", "attachment; filename=\"" + designName + ".xls\"");         
resp.setContentType("text/html;charset=UTF-8");
resp.setContentLength(output.length());
final ServletOutputStream out = resp.getOutputStream();
out.print(output);
out.flush();

Klockwork complains that since designName is being used on the response Header it needs to be validated for XSS vulnerabilities. In particular it suggests to remove carriage returns and line feeds from the input. Will GWT's SimpleHtmlSanitizer.sanitize(input) remove those characters?

Upvotes: 2

Views: 856

Answers (3)

vilo
vilo

Reputation: 11

String concatenated in different contexts should be sanitized in different ways. SimpleHtmlSanitizer sanitizes strings about to be inserted into html code, using it here is wrong. For Content-Disposition header, I use the following method. It does support international characters in file names, and escapes all special characters, including space and newlines.

private static final char[] HEX_CHARS = "0123456789abcdef".toCharArray();

public static String encodeContentDispositionFilename(String fname) {
    byte[] bytes;
    try {
        bytes = fname.getBytes("utf-8");
    } catch (UnsupportedEncodingException e) {
        throw new RuntimeException(e);
    }
    int i=0, l=bytes.length;
    StringBuilder sb = new StringBuilder();
    sb.append("filename*=UTF-8''");
    while (i<l) {
        if (bytes[i] >= '0' && bytes[i] <= '9' || bytes[i] >= 'a' && bytes[i] <= 'z' || bytes[i] >= 'A' && bytes[i] <= 'Z'
                || bytes[i] == '.' || bytes[i] == '_')
            sb.append((char)bytes[i]);
        else
            sb
                .append('%')
                .append(HEX_CHARS[(bytes[i] >> 4) & 0xf])
                .append(HEX_CHARS[bytes[i] & 0xF]);
        i++;
    }

    return sb.toString();
}

Now rewrite your code like this:

final String designName = req.getParameter("designName");
final String escaped = encodeContentDispositionFilename(designName + ".xls");
resp.setHeader("Content-Disposition", "attachment; " + escaped);         
// ...

For information about various escape contexts, see http://googleonlinesecurity.blogspot.com/2009/03/reducing-xss-by-way-of-automatic.html.

Upvotes: 1

Tahir Akhtar
Tahir Akhtar

Reputation: 11625

The CR/LF removal requirement of clockworks is most likely for safeguarding against HTTP header injection vulnerability. CR/LF are completely safe as far as HTML is concerned, therefore HTML sanitizers will always leave them unchanged.

You should probably create an Http header sanitization method that removes CR/LF. I don't think that any browser will execute a script embedded in an http header so doing html sanitization on headers might not give any benefit.

Upvotes: 2

Jason Terk
Jason Terk

Reputation: 6025

You may be better off using SafeHtmlUtils#htmlEscape or SafeHtmlUtils#htmlEscapeAllowEntities.

Upvotes: 0

Related Questions