Daniel Buckle

Reputation: 628

XSS Filter Regex catching the wrong words

I've found multiple guides on how to implement an XSS filter that uses different regular expressions to pick out scripting. But I've found a flaw in the one that looks for eval() calls. The regex eval.*?\((.*?)\) picks out eval(...), but it also matches words like evaluate or medieval when a parenthesis follows later in the text.

Any ideas on how I can make this regex better?

Upvotes: 0

Views: 364

Answers (2)

Erlend

Reputation: 4416

This filter is very likely flawed in several other ways. First of all, the call doesn't have to be written eval("something"). It can also be evalx("something"), where x is ASCII 9, 10, 11, 12, 13 or 32 (and possibly other Unicode whitespace as well). So, for instance, eval ("something") still runs. Secondly, it could be window["eval"]("something"), window["EVAL".toLowerCase()]("something"), window["e" + "val"]("something"), window["ev\x61l"]("something"), and so on.
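To make this concrete, here is a rough Python sketch that runs the variants listed above through two patterns: the one from the question and the tightened \beval\((.*?)\) form. The names ORIGINAL, TIGHTENED, payloads and missed_by are made up for this illustration, and Python's re module is only a stand-in for whatever engine the real filter uses.

```python
import re

# Pattern from the question and a tightened variant with a word boundary.
ORIGINAL = re.compile(r"eval.*?\((.*?)\)")
TIGHTENED = re.compile(r"\beval\((.*?)\)")

# JavaScript snippets that all end up calling eval in a browser.
# These are plain strings here; nothing is executed.
payloads = [
    'eval ("something")',
    'window["eval"]("something")',
    'window["EVAL".toLowerCase()]("something")',
    'window["e" + "val"]("something")',
    r'window["ev\x61l"]("something")',
]


def missed_by(pattern, samples):
    """Return the samples that the pattern fails to flag."""
    return [s for s in samples if pattern.search(s) is None]


for name, pattern in (("original", ORIGINAL), ("tightened", TIGHTENED)):
    missed = missed_by(pattern, payloads)
    print(f"{name}: missed {len(missed)} of {len(payloads)}")
    for s in missed:
        print("   ", s)
```

Under these assumptions, the tightened pattern misses every variant, and the original pattern still misses the ones where the literal text eval never appears in the input.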

Stopping XSS through input validation is very hard, because it depends on where the data is output (the context). See the OWASP XSS Prevention Cheat Sheet for examples.
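As a small illustration of what "context" means here, the sketch below encodes the same untrusted string for two different output contexts using only the Python standard library. The variable names are hypothetical, and real applications usually rely on their template engine's auto-escaping rather than hand-rolled calls like these.

```python
import html
from urllib.parse import quote

# Untrusted input (hypothetical example value).
user_input = '<script>eval("x")</script>'

# HTML element content: escape <, >, & and quotes.
as_html_text = html.escape(user_input)

# URL query parameter value: percent-encode everything that is not unreserved.
as_url_param = quote(user_input, safe="")

print(as_html_text)
print(as_url_param)
```

The same input needs a different encoding in each place, which is why a single blacklist applied on input cannot cover every case.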

Upvotes: 0

Wiktor Stribiżew

Reputation: 627044

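The regex matches more than expected for two reasons: there is no word boundary check on the left, so eval can match inside a longer word such as medieval, and the lazy dot pattern .*? between eval and \( allows any zero or more characters other than a newline, so evaluate followed later by a parenthesis also matches.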

So to only match eval(...), use

\beval\((.*?)\)

or even

\beval\(([^()]*)\)
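A quick way to compare the three patterns is to run them over a few sample strings. This is only a sketch using Python's re module; the names LOOSE, BOUNDED, NO_PARENS, samples and matched_text are invented for the example.

```python
import re

LOOSE = re.compile(r"eval.*?\((.*?)\)")        # pattern from the question
BOUNDED = re.compile(r"\beval\((.*?)\)")       # word boundary, lazy argument
NO_PARENS = re.compile(r"\beval\(([^()]*)\)")  # argument may not contain parentheses

samples = [
    "a medieval castle (ruined)",
    "evaluate the results (carefully)",
    "eval(payload)",
    "eval(alert(1))",
]


def matched_text(pattern, text):
    """Return the full matched substring, or None if there is no match."""
    m = pattern.search(text)
    return m.group(0) if m else None


for text in samples:
    print(repr(text))
    for name, pattern in (("loose", LOOSE), ("bounded", BOUNDED), ("no_parens", NO_PARENS)):
        print(f"    {name:10} -> {matched_text(pattern, text)!r}")
```

Note that with nested parentheses, as in eval(alert(1)), the lazy version stops at the first closing parenthesis and the [^()]* version does not match at all, so which one fits depends on what the filter is supposed to do with the match.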

Upvotes: 1
