Arth

Reputation: 361

Blacklist/whitelist for XSS

I need to implement XSS defence and I am having trouble with it. I read this cheat sheet: https://cheatsheetseries.owasp.org/cheatsheets/Cross_Site_Scripting_Prevention_Cheat_Sheet.html There is a lot of valuable information in it, but it is fairly difficult for me to implement. I understand that you need to escape untrusted data, and I have already implemented that in my application, but I also need to implement some kind of blacklist/whitelist, right? That is, what is allowed in the data and what is not. I tried to use the code below on my server side (which is Java), but I would need something similar on the front-end side. I am using plain JavaScript and jQuery.

  1. Is it a good approach?
  2. Is there any library which could help me to build blacklist/whitelist?
  3. Or how can I make sure that escaped data does not contain for example javascript: etc?
  4. I found this library for escaping characters. https://github.com/YahooArchive/xss-filters/wiki Is it ok to use it?

        // Avoid anything between script tags
        Pattern scriptPattern = Pattern.compile("<script>(.*?)</script>", Pattern.CASE_INSENSITIVE);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Avoid anything in a src='...' type of expression
        scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\'(.*?)\\\'", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
        value = scriptPattern.matcher(value).replaceAll("");
    
        scriptPattern = Pattern.compile("src[\r\n]*=[\r\n]*\\\"(.*?)\\\"", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Remove any lonesome </script> tag
        scriptPattern = Pattern.compile("</script>", Pattern.CASE_INSENSITIVE);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Remove any lonesome <script ...> tag
        scriptPattern = Pattern.compile("<script(.*?)>", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Avoid eval(...) expressions
        scriptPattern = Pattern.compile("eval\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Avoid expression(...) expressions
        scriptPattern = Pattern.compile("expression\\((.*?)\\)", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Avoid javascript:... expressions
        scriptPattern = Pattern.compile("javascript:", Pattern.CASE_INSENSITIVE);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Avoid vbscript:... expressions
        scriptPattern = Pattern.compile("vbscript:", Pattern.CASE_INSENSITIVE);
        value = scriptPattern.matcher(value).replaceAll("");
    
        // Avoid onload= expressions
        scriptPattern = Pattern.compile("onload(.*?)=", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
        value = scriptPattern.matcher(value).replaceAll("");
    

Upvotes: 2

Views: 9948

Answers (3)

Max

Reputation: 437

This may help on the frontend; you can use it on the backend as well.

    // Strips the characters <, >, / and ' from the string.
    String.prototype.preventXss = function () {
      const blackList = /[<>\/']/g;
      return this.replace(blackList, '');
    };

    let mystring = '<h1>';
    console.log(mystring.preventXss()); // prints "h1"

Upvotes: 0

fgb

Reputation: 18559

Blacklists don't really work. They can only cover attacks that match a pattern that the programmer has already thought of, but there are new variations and techniques found all the time. See XSS Filter Evasion Cheat Sheet for example.

Browsers like Chrome have made a big effort to detect XSS with their built-in filters, but even they are now planning to remove the filter because it was full of holes and blocked legitimate input (see "Google to remove Chrome's built-in XSS protection").

The filters you've found are particularly bad. There are obvious omissions: for example, they look for onload but not onmouseover. They don't deal with nested values, so <vbscript:script>alert(1)</scriptvbscript:> becomes <script>alert(1)</script>. There are also many types of XSS attack that this kind of filter can't detect, such as attacks where multiple parameters are used together.
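The nested-value bypass can be reproduced with a rough sketch. The function below is a hypothetical JavaScript port of a few rules from the question's Java filter, applied in the same order; `blacklistFilter` is a name made up for illustration, and the port is only meant to show the failure, not to be used.

```javascript
// Hypothetical port of some of the question's blacklist rules (same order).
// Shown only to demonstrate the bypass; not a recommended defence.
function blacklistFilter(value) {
  return value
    .replace(/<script>(.*?)<\/script>/gi, "") // anything between script tags
    .replace(/<\/script>/gi, "")              // lonesome </script>
    .replace(/<script(.*?)>/gis, "")          // lonesome <script ...>
    .replace(/javascript:/gi, "")             // javascript: scheme
    .replace(/vbscript:/gi, "");              // vbscript: scheme
}

// No rule matches until "vbscript:" is removed, and that removal
// assembles a working script tag that no later rule looks at.
const payload = "<vbscript:script>alert(1)</scriptvbscript:>";
console.log(blacklistFilter(payload)); // <script>alert(1)</script>

// An event handler the blacklist never thought of passes through untouched.
console.log(blacklistFilter("<img src=x onmouseover=alert(1)>"));
```

Running the filter repeatedly until the output stops changing would close this particular hole, but not the missing-pattern problem shown by the second call.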

Instead, focus on the OWASP rules in that cheat sheet. They do use whitelists in places, but those are easier to implement because they look for specific known values. So Rule #7 (Avoid JavaScript URLs) can be implemented by checking for 'http:' or 'https:' at the beginning of any URL you're outputting. Rule #6 (Sanitize HTML Markup) can be implemented with a separate library that is configured to allow only specific tags and values.
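As a minimal sketch of the URL whitelist idea, the check might look like the following. Instead of a plain string-prefix test it parses the value with the standard `URL` class, which normalises case and stray whitespace in the scheme; that swap is a design choice of this sketch. The name `isSafeUrl` and the base `https://example.com/` are assumptions for illustration.

```javascript
// Only these schemes are allowed; everything else is rejected.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:"]);

// Hypothetical helper: returns true if the URL resolves to http(s).
// The base is a placeholder; use your own site's origin so that
// relative URLs resolve the way the browser would resolve them.
function isSafeUrl(untrustedUrl, base = "https://example.com/") {
  try {
    const parsed = new URL(untrustedUrl, base);
    return ALLOWED_PROTOCOLS.has(parsed.protocol);
  } catch (err) {
    // Unparseable input is treated as unsafe.
    return false;
  }
}

console.log(isSafeUrl("https://example.org/page")); // true
console.log(isSafeUrl("/profile/42"));              // true (relative, resolves to https)
console.log(isSafeUrl("JaVaScRiPt:alert(1)"));      // false
```

A value that fails the check would simply not be written into an `href` or `src` attribute. The value still needs attribute escaping when it is output.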

The Yahoo library looks reasonable enough for escaping, but it looks like it's not maintained anymore. Their approach of escaping the minimum possible characters for performance reasons requires more methods than some other libraries, and you need to be more careful to use exactly the right method for each context (like inSingleQuotedAttr vs inDoubleQuotedAttr). Instead, I'd use a library that escapes at least &, <, >, ", and ' for HTML escaping; then a lot of these methods could be merged together.
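For illustration, an escaper covering those five characters could be sketched as below. `escapeHtml` is a hypothetical name, not a function from the Yahoo library, and the sketch assumes the result is placed in HTML element content or inside a quoted attribute value; it is not sufficient for JavaScript, CSS, or URL contexts.

```javascript
// Replacement table for the five characters named above.
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Hypothetical helper: escapes a value for HTML content or quoted attributes.
function escapeHtml(untrusted) {
  return String(untrusted).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml(`<b onclick="x('y')">&</b>`));
// &lt;b onclick=&quot;x(&#39;y&#39;)&quot;&gt;&amp;&lt;/b&gt;
```

Because both quote characters are escaped, the same function works for single-quoted and double-quoted attributes, which is the merging of methods described above.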

With JavaScript, most modern templating languages escape values by default, or you can stick to the text-based DOM methods like $().attr() and $().text() instead of $().html(); then there's not as much need for an external escaping library.
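A small sketch of the text-based approach, assuming jQuery is already loaded on the page. The function names and the `#display-name` selector are hypothetical.

```javascript
// Hypothetical helpers; $el is a jQuery object such as $("#display-name").

function renderDisplayName($el, untrustedName) {
  // .text() inserts the value as a text node, so any markup in it is
  // displayed literally instead of being parsed as HTML.
  $el.text(untrustedName);
}

function setTooltip($el, untrustedTitle) {
  // .attr() sets the attribute value without building an HTML string,
  // so quotes in the value cannot break out of the attribute.
  $el.attr("title", untrustedTitle);
}

// Browser usage (not run here):
//   renderDisplayName($("#display-name"), nameFromServer);
//   setTooltip($("#display-name"), titleFromServer);
```

Note that `.attr()` only protects against breaking out of the attribute. For attributes such as `href` or `src` the value itself still needs a URL scheme check, and event-handler attributes should never receive untrusted data.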

Upvotes: 3

Igor Servulo

Reputation: 371

You can add text escaping to block reflected XSS attacks, but you should really consider the implementation of security headers on your web server to block stored XSS attacks.

Check out the CSP Security Header for a detailed explanation and documentation on how to implement it. If you use something like NGINX it's pretty easy to implement.
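As a rough illustration of what such a header contains, the sketch below assembles a Content-Security-Policy value from a set of directives. The helper name `buildCspHeaderValue` and the chosen policy are assumptions for the example; a real policy has to be tuned to the scripts and resources your site actually loads.

```javascript
// Hypothetical helper: joins directives into a CSP header value.
function buildCspHeaderValue(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Example policy (an assumption): only same-origin resources, no plugins.
const examplePolicy = buildCspHeaderValue({
  "default-src": ["'self'"],
  "script-src": ["'self'"],
  "object-src": ["'none'"],
});

console.log(examplePolicy);
// default-src 'self'; script-src 'self'; object-src 'none'
```

In NGINX, a value like this would typically go into an `add_header Content-Security-Policy "..." always;` directive in the relevant `server` or `location` block. A policy without `'unsafe-inline'` in `script-src` stops injected inline scripts from running, which is why it helps as a second layer behind output escaping.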

If you have any doubts about the difference between these types of XSS attack, please let me know.

Upvotes: 1
