johnny 5

Reputation: 20997

Preventing XSS using Regex

I'm using a regex to strip tags from text:

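// Requires: using System.Text.RegularExpressions;
// Matches anything shaped like a tag, including quoted and unquoted attribute values.
static string Pattern = "<(?:[^>=]|='[^']*'|=\"[^\"]*\"|=[^'\"][^\\s>]*)*>";

// Removes every match of Pattern from the input.
public static string StripHtml(string value)
{
    return Regex.Replace(value, Pattern, string.Empty);
}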

This seems pretty secure, but is it really? Is there a way to execute XSS without using tags?

Would it be better to use a Markdown editor, or would that have similar issues, since Markdown editors allow tags as well?

Or should I just manually parse the tags I want to allow and let users enter whatever they like?

Upvotes: 1

Views: 4499

Answers (2)

avgvstvs

Reputation: 6315

You didn't specify which language version of ESAPI you're using, but regex is 100% the wrong solution if you need to accept HTML into your application. HTML is a context-free language, not a regular one, so regular expressions cannot parse it.
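As a rough illustration, here is a minimal sketch that runs the pattern from the question against two inputs. The class name `StripHtmlDemo` and both payload strings are made up for this example. The first payload is a tag with no closing `>`, which the pattern cannot match; the second contains no tag at all and only matters if the value ends up inside a quoted attribute.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripHtmlDemo
{
    // Pattern copied from the question.
    static readonly string Pattern = "<(?:[^>=]|='[^']*'|=\"[^\"]*\"|=[^'\"][^\\s>]*)*>";

    public static string StripHtml(string value)
    {
        return Regex.Replace(value, Pattern, string.Empty);
    }

    public static void Main()
    {
        // Well-formed tags are removed as intended.
        Console.WriteLine(StripHtml("<b>hi</b>")); // prints hi

        // Hypothetical payload 1: the tag is never closed, so the pattern
        // (which requires a final '>') does not match and nothing is removed.
        string unclosed = "<img src=x onerror=alert(1) ";
        Console.WriteLine(StripHtml(unclosed) == unclosed); // prints True

        // Hypothetical payload 2: no tag at all. If this value is written into
        // a double-quoted attribute, it closes the attribute and adds a handler.
        string attribute = "\" onmouseover=\"alert(1)";
        Console.WriteLine(StripHtml(attribute) == attribute); // prints True
    }
}
```

Both payloads pass through unchanged, so whether they are dangerous depends entirely on where the output is later written, which the regex knows nothing about.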

You want something like OWASP's HTML Sanitizer or, although it hasn't been updated in some time, AntiSamy. Such a library is backed by an actual HTML parser, and allows you to specify legal tags and THEN specify regexes for legal content within them.

Also note that it is much more important to make sure your application has correctly implemented output escaping before you worry about HTML sanitization. You can ignore XSS validation entirely if you properly escape for every context. (The reverse is not true.)
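For the HTML context specifically, a minimal sketch of output escaping in C# could look like the following. It assumes the value is written into an HTML body or a quoted attribute; `EscapeDemo` and `ForHtml` are hypothetical names for this example, while `WebUtility.HtmlEncode` is the standard .NET call. Other contexts (JavaScript, URLs, CSS) need their own encoders, and this sketch does not cover them.

```csharp
using System;
using System.Net;

public static class EscapeDemo
{
    // Hypothetical helper: encodes untrusted text for an HTML body
    // or a quoted HTML attribute value.
    public static string ForHtml(string untrusted)
    {
        return WebUtility.HtmlEncode(untrusted);
    }

    public static void Main()
    {
        // Markup characters become entities, so the browser shows them as text.
        Console.WriteLine(ForHtml("<script>alert(1)</script>"));
        // prints &lt;script&gt;alert(1)&lt;/script&gt;

        // The double quote is encoded too, so the value cannot close an attribute.
        Console.WriteLine(ForHtml("\" onmouseover=\"alert(1)"));
        // prints &quot; onmouseover=&quot;alert(1)
    }
}
```

The key design point is that encoding happens at output time, per context, rather than trying to clean the input once on the way in.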

Upvotes: 2

ManthanB

Reputation: 414

You can use ESAPI; it will help you prevent XSS as well as other security vulnerabilities. It already includes some validation, with regexes defined for it. If you want a custom regex, you have to define it explicitly.

Upvotes: 1
