Reputation: 20997
I'm using a regex to strip tags from text:
static string Pattern = "<(?:[^>=]|='[^']*'|=\"[^\"]*\"|=[^'\"][^\\s>]*)*>";

public static string StripHtml(string Value)
{
    // Remove anything that looks like a complete <...> tag.
    return Regex.Replace(Value, Pattern, string.Empty);
}
This seems pretty secure, but is it really? Is there a way to execute XSS without using tags?
Would it be better to use a Markdown editor, or would that have similar issues, since Markdown allows tags as well?
Or should I just manually parse the tags I want and let users enter whatever else they like?
Upvotes: 1
Views: 4499
Reputation: 6315
You didn't specify which language version of ESAPI you're using, but regex is 100% the wrong solution if you need to accept HTML into your application. HTML is a context-free language, not a regular one, so regular expressions cannot parse it.
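To make the limitation concrete, here is a minimal sketch that runs the question's pattern against two inputs that contain no complete tag. The class name StripHtmlDemo and the sample payloads are made up for illustration, and whether a given payload actually fires depends on where the output lands in the page.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StripHtmlDemo
{
    // Pattern copied verbatim from the question.
    const string Pattern = "<(?:[^>=]|='[^']*'|=\"[^\"]*\"|=[^'\"][^\\s>]*)*>";

    public static string StripHtml(string value) =>
        Regex.Replace(value, Pattern, string.Empty);

    public static void Main()
    {
        // A complete tag is removed as intended.
        Console.WriteLine(StripHtml("<b>hi</b>")); // prints "hi"

        // Unterminated tag: the pattern requires a closing '>', so nothing
        // matches. A browser may still treat this as an element once some
        // later '>' in the surrounding page closes it.
        string unterminated = "<img src=x onerror=alert(1) ";
        Console.WriteLine(StripHtml(unterminated) == unterminated); // prints "True"

        // No tag at all: if this value is written into a quoted attribute
        // without escaping, it breaks out and adds an event handler.
        string attributePayload = "\" onmouseover=\"alert(1)";
        Console.WriteLine(StripHtml(attributePayload) == attributePayload); // prints "True"
    }
}
```

Both payloads pass through untouched, which is why tag stripping alone cannot be relied on.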
You want something like OWASP's HTML Sanitizer or AntiSamy (although the latter hasn't been updated in some time). Such a library is backed by an actual HTML parser, and lets you specify legal tags and THEN specify regexes for legal content within them.
Also note that it is much more important to make sure your application implements output escaping correctly before you worry about HTML sanitization. You can ignore XSS validation entirely if you properly escape for every context. (The reverse is not true.)
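As a rough illustration of output escaping for one context (HTML element content and quoted attribute values), the sketch below uses .NET's WebUtility.HtmlEncode. The RenderInput helper and the payload are hypothetical, and other contexts such as JavaScript or URLs need their own encoders.

```csharp
using System;
using System.Net;

public static class OutputEncodingDemo
{
    // Hypothetical helper: writes untrusted text into a quoted HTML attribute.
    public static string RenderInput(string untrusted)
    {
        // HtmlEncode turns ", <, > and & into entities, so the value cannot
        // close the attribute or open a new tag.
        string encoded = WebUtility.HtmlEncode(untrusted);
        return "<input value=\"" + encoded + "\">";
    }

    public static void Main()
    {
        // Tag-free payload that a tag-stripping regex leaves untouched.
        string payload = "\" onmouseover=\"alert(1)";
        Console.WriteLine(RenderInput(payload));
        // prints: <input value="&quot; onmouseover=&quot;alert(1)">
    }
}
```

The escaping happens at the point of output, so it holds no matter what earlier input validation let through.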
Upvotes: 2
Reputation: 414
You can use ESAPI; it will help you prevent XSS as well as other security vulnerabilities. It already includes some validation rules, with regexes defined for them. If you want a custom regex, you have to define it explicitly.
Upvotes: 1