Reputation: 628
I've found multiple guides on how to implement an XSS filter by using different regular expressions to pick out scripting. But I've found a flaw in the one that detects eval() calls. This regex, eval.*?\((.*?)\), picks out eval calls but also matches words like evaluate or medieval when they are followed by parentheses.
Any ideas on how I can make this regex better?
Upvotes: 0
Views: 364
Reputation: 4416
This filter is very likely flawed in several other ways. First of all, it doesn't have to be eval("something"). It can also be evalx("something"), where x can be ASCII 9, 10, 11, 12, 13 or 32 (and possibly other Unicode values as well). So, for instance, eval ("something") still runs. Secondly, it could be window["eval"]("something") or window["EVAL".toLowerCase()]("something") or window["e" + "val"]("something"), or window["ev\x61l"]("something") and so on.
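To make that concrete, here is a minimal sketch in Python that runs a few of the variants above through a regex. It assumes a tightened pattern, \beval\((.*?)\), as the filter (my choice for illustration, not something the guides prescribe). The payloads are only treated as text; nothing executes them.

```python
import re

# Hypothetical filter pattern (an assumption for this sketch): a strict "eval(" matcher.
EVAL_CALL = re.compile(r"\beval\((.*?)\)")

payloads = [
    'eval("something")',                          # plain call
    'eval ("something")',                         # whitespace before the parenthesis
    'window["eval"]("something")',                # property lookup by string
    'window["EVAL".toLowerCase()]("something")',  # name built at runtime
    'window["e" + "val"]("something")',           # string concatenation
]

for p in payloads:
    caught = EVAL_CALL.search(p) is not None
    print(f"{'caught' if caught else 'missed'}: {p}")
```

Only the first payload is caught; the other four slip past, which is the point being made here.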
Stopping XSS through input validation is very hard, because it depends on where the data is output (the context). See the OWASP XSS Prevention Cheat Sheet for examples.
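As a rough illustration of what encoding for the output context looks like, here is a sketch in Python for one context only: HTML element content. It uses the standard library's html.escape; the helper name render_comment is made up for this example. Other contexts (attribute values, JavaScript, URLs) need different encoders, as the cheat sheet explains.

```python
import html

def render_comment(user_text: str) -> str:
    """Hypothetical helper: encode untrusted text for HTML element content only."""
    # html.escape replaces &, < and >, plus both quote characters (quote=True is the default).
    return "<p>" + html.escape(user_text) + "</p>"

print(render_comment('<script>eval("x")</script>'))
# prints: <p>&lt;script&gt;eval(&quot;x&quot;)&lt;/script&gt;</p>
```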
Upvotes: 0
Reputation: 627044
The regex matches more than expected for two reasons. There is no word boundary check on the left, so eval is also found at the end of medieval. And the lazy dot pattern .*? between eval and \( allows zero or more characters other than a newline, so the uate in evaluate( is skipped over.
So to only match eval(...), use
\beval\((.*?)\)
or even
\beval\(([^()]*)\)
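A small comparison in Python, in case it helps to see the three patterns side by side. The sample strings are my own, and Python's re module is assumed to behave like your regex engine for these constructs.

```python
import re

original = re.compile(r"eval.*?\((.*?)\)")      # pattern from the question
bounded = re.compile(r"\beval\((.*?)\)")        # word boundary, "(" must follow directly
no_nested = re.compile(r"\beval\(([^()]*)\)")   # additionally forbids parentheses inside

samples = ["eval(alert(1))", "medieval(castle)", "evaluate(x)", "eval (x)"]

for s in samples:
    hits = [bool(rx.search(s)) for rx in (original, bounded, no_nested)]
    print(f"{s!r}: original={hits[0]} bounded={hits[1]} no_nested={hits[2]}")
```

The original pattern matches all four samples. The bounded pattern matches only eval(alert(1)), and because it is lazy it captures alert(1 up to the first closing parenthesis. The [^()]* variant does not match that sample at all, since the argument contains parentheses. Neither stricter pattern matches eval (x), which has a space before the parenthesis.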
Upvotes: 1