Reputation: 361

Blacklist/whitelist for XSS

I need to implement XSS defence and I am having trouble with it. I read this cheat sheet: https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html There is a lot of valuable information in it, but it's fairly difficult for me to implement. I understand that you need to escape untrusted data, and I have already implemented that in my application, but I also need to implement some kind of blacklist/whitelist, right? That is, what is allowed in the data and what is not. I tried to use this code on my server side (which is Java), but I would need something similar on the front-end side. I am using core JavaScript and jQuery.

Is it a good approach?
Is there any library which could help me to build blacklist/whitelist?
Or how can I make sure that escaped data does not contain, for example, javascript: etc.?

I found this library for escaping characters. https://github.com/YahooArchive/xss-filters/wiki Is it ok to use it?

    // Avoid anything between script tags
    Pattern scriptPattern = Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE);
    value = scriptPattern.matcher(value).replaceAll("");

    // Avoid anything in a src='...' type of expression
    scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\'(.*?)\\\'", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
    value = scriptPattern.matcher(value).replaceAll("");

    scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\"(.*?)\\\"", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
    value = scriptPattern.matcher(value).replaceAll("");

    // Remove any lonesome </script> tag
    scriptPattern = Pattern.compile("</script>", Pattern.CASE_INSENSITIVE);
    value = scriptPattern.matcher(value).replaceAll("");

    // Remove any lonesome <script ...> tag
    scriptPattern = Pattern.compile("<script(.*?)>", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
    value = scriptPattern.matcher(value).replaceAll("");

    // Avoid eval(...) expressions
    scriptPattern = Pattern.compile("eval\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
    value = scriptPattern.matcher(value).replaceAll("");

    // Avoid expression(...) expressions
    scriptPattern = Pattern.compile("expression\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
    value = scriptPattern.matcher(value).replaceAll("");

    // Avoid javascript:... expressions
    scriptPattern = Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE);
    value = scriptPattern.matcher(value).replaceAll("");

    // Avoid vbscript:... expressions
    scriptPattern = Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE);
    value = scriptPattern.matcher(value).replaceAll("");

    // Avoid onload= expressions
    scriptPattern = Pattern.compile("onload(.*?)=", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
    value = scriptPattern.matcher(value).replaceAll("");

Upvotes: 2

Answers (3)

Max

Reputation: 437

This may help on the front end; you can use the same idea on the back end as well.

    // Strip single quotes, angle brackets and slashes from a string.
    String.prototype.preventXss = function () {
      const blackList = /['</>]/g;
      return this.replace(blackList, '');
    };

    let mystring = '<h1>';
    console.log(mystring.preventXss()); // h1

Upvotes: 0

fgb

Reputation: 18559

Blacklists don't really work. They can only cover attacks that match a pattern the programmer has already thought of, but new variations and techniques are found all the time. See the XSS Filter Evasion Cheat Sheet, for example.

Browsers like Chrome have made a big effort to detect XSS with their filters, but even so Chrome is now planning to remove its filter because it was full of holes and blocked legitimate input (Google to remove Chrome's built-in XSS protection).

The filters you've found are particularly bad. There are obvious omissions: for example, they look for onload but not onmouseover. They don't deal with nested values, so <vbscript:script>alert(1)</scriptvbscript:> becomes <script>alert(1)</script>. There are also many types of XSS attack that this kind of filter can't detect, such as when multiple parameters are used together.
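The nesting problem can be reproduced in a few lines. The sketch below is a partial JavaScript port of the question's Java filter; the port and the name `naiveFilter` are my own, and it assumes the steps run in the same order as in the question:

```javascript
// Partial port of the question's regex blacklist (hypothetical helper).
function naiveFilter(value) {
  return value
    .replace(/<script>(.*?)<\/script>/gi, '') // anything between script tags
    .replace(/<\/script>/gi, '')              // lonesome </script>
    .replace(/<script(.*?)>/gis, '')          // lonesome <script ...>
    .replace(/javascript:/gi, '')
    .replace(/vbscript:/gi, '')
    .replace(/onload(.*?)=/gis, '');
}

// Removing "vbscript:" is what assembles the script tag.
const nested = '<vbscript:script>alert(1)</scriptvbscript:>';
console.log(naiveFilter(nested)); // <script>alert(1)</script>

// An event handler that is not on the list is returned unchanged.
const handler = '<img src=x onmouseover=alert(1)>';
console.log(naiveFilter(handler) === handler); // true
```

Running the filter repeatedly until the output stops changing would close this particular hole, but not the missing handlers or the many other encodings.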

Instead, focus on the OWASP rules in that cheat sheet. They do use whitelists in places, but those are easier to implement because they look for specific known values. So Rule #7 (Avoid JavaScript URLs) can be implemented by checking for 'http:' or 'https:' at the beginning of any URL you're outputting. Rule #6 (Sanitize HTML Markup) can be implemented with a library that is configured to allow only specific tags and values.
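A minimal sketch of the scheme check for Rule #7, assuming only absolute http: and https: URLs should be accepted. The name `isSafeUrl` is hypothetical; the built-in `URL` parser is used instead of a string prefix test so that case and stray whitespace are normalised before the comparison:

```javascript
// Whitelist of URL schemes that may be written into href/src attributes.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

function isSafeUrl(value) {
  try {
    // The parser strips surrounding whitespace and lowercases the scheme.
    return ALLOWED_PROTOCOLS.has(new URL(value).protocol);
  } catch (err) {
    // Not an absolute URL (this also rejects relative URLs).
    return false;
  }
}

console.log(isSafeUrl('https://example.com/a?b=1')); // true
console.log(isSafeUrl(' JaVaScRiPt:alert(1)'));      // false
```

Because relative URLs are rejected here, an application that needs them would have to handle that case separately. The URL still has to be escaped for the attribute it is written into.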

The Yahoo library looks reasonable enough for escaping, but it looks like it's not maintained anymore. Their approach of escaping the minimum possible set of characters for performance reasons requires more methods than some other libraries, and you need to be more careful to use exactly the right method for each context (like inSingleQuotedAttr vs inDoubleQuotedAttr). Instead, I'd use a library that escapes at least &, <, >, ", ' for its HTML escaping, and then a lot of these methods could be merged together.
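For illustration, escaping those five characters takes only a few lines. This is a sketch rather than a replacement for a maintained library, and the name `escapeHtml` is hypothetical:

```javascript
// Map each HTML-significant character to its entity.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml(`<a href="x" title='y'>Tom & Jerry</a>`));
// &lt;a href=&quot;x&quot; title=&#39;y&#39;&gt;Tom &amp; Jerry&lt;/a&gt;
```

Because both quote characters are escaped, the same function can be used for element content and for single- or double-quoted attribute values. It is not sufficient inside script blocks, CSS, or URLs, which need their own encoding.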

With JavaScript, most modern templating languages will escape values by default, or you can stick to the text based DOM methods like $().attr() and $().text() instead of $().html(), then there's not as much of a need for an external escaping library.
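As a rough sketch of the text-based approach, the hypothetical function below assumes `$container` is a jQuery object (for example `$('#comment')`) and that `comment` is an untrusted object with `body` and `author` fields:

```javascript
// Render untrusted data with jQuery's text-based methods only.
function renderComment($container, comment) {
  // .text() inserts the value as a text node, so markup in it is not parsed.
  $container.text(comment.body);
  // .attr() sets the attribute value directly, without parsing it as HTML.
  $container.attr('title', comment.author);
  return $container;
}
```

With `.html(comment.body)` the same input would be parsed as markup. Note that `.attr('href', ...)` with an untrusted value still needs a check of the URL scheme, since a javascript: URL is a valid attribute value.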

Upvotes: 3

Igor Servulo

Reputation: 371

You can add text escaping to block reflected XSS attacks, but you should really consider the implementation of security headers on your web server to block stored XSS attacks.

Check out the CSP Security Header for a detailed explanation and documentation on how to implement it. If you use something like NGINX it's pretty easy to implement.
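To give an idea of what such a header contains, here is a small sketch that assembles a Content-Security-Policy value. The helper name `buildCsp` and the example policy are assumptions; the directives have to be tuned to the resources your pages load:

```javascript
// Build a Content-Security-Policy header value from a map of directives.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

// Example policy: only same-origin resources, no plugins.
const policy = buildCsp({
  'default-src': ["'self'"],
  'script-src': ["'self'"],
  'object-src': ["'none'"],
});

console.log(policy);
// default-src 'self'; script-src 'self'; object-src 'none'
```

In NGINX the resulting string would go into an `add_header Content-Security-Policy "..." always;` directive; in a Node server it could be set with `res.setHeader('Content-Security-Policy', policy)`. A policy like this blocks inline scripts, so it limits what an injected payload can do, but it complements output escaping rather than replacing it.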

If you have any doubts about the difference between these XSS attacks, please let me know.

Upvotes: 1
